Search | Nursing VHL Search Portal

1.

Follicular lymphoma evolves with a surmountable dependency on acquired glycosylation motifs in the B-cell receptor.

Haebe, Sarah; Day, Grady; Czerwinski, Debra K; Sathe, Anuja; Grimes, Susan M; Chen, Tianqi; Long, Steven R; Martin, Brock; Ozawa, Michael G; Ji, Hanlee P; Shree, Tanaya; Levy, Ronald.

Blood ; 142(26): 2296-2304, 2023 12 28.

Article in English | MEDLINE | ID: mdl-37683139

ABSTRACT

An early event in the genesis of follicular lymphoma (FL) is the acquisition of new glycosylation motifs in the B-cell receptor (BCR) due to gene rearrangement and/or somatic hypermutation. These N-linked glycosylation motifs (N-motifs) contain mannose-terminated glycans and can interact with lectins in the tumor microenvironment, activating the tumor BCR pathway. N-motifs are stable during FL evolution, suggesting that FL tumor cells are dependent on them for their survival. Here, we investigated the dynamics and potential impact of N-motif prevalence in FL at the single-cell level across distinct tumor sites and over time in 17 patients. Although most patients had acquired at least 1 N-motif as an early event, we also found (1) cases without N-motifs in the heavy or light chains at any tumor site or time point and (2) cases with discordant N-motif patterns across different tumor sites. Inferring phylogenetic trees of the patients with discordant patterns, we observed that both N-motif-positive and N-motif-negative tumor subclones could be selected and expanded during tumor evolution. Comparing N-motif-positive with N-motif-negative tumor cells within a patient revealed higher expression of genes involved in the BCR pathway and inflammatory response, whereas tumor cells without N-motifs had higher activity of pathways involved in energy metabolism. In conclusion, although acquired N-motifs likely support FL pathogenesis through antigen-independent BCR signaling in most patients with FL, N-motif-negative tumor cells can also be selected and expanded and may depend more heavily on altered metabolism for competitive survival.
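The idea of an "acquired" N-motif can be illustrated with a small sketch. It assumes the standard N-linked glycosylation sequon, Asn-X-Ser/Thr with X being any residue except proline, and compares a tumor BCR protein sequence against its germline counterpart. The function names and the toy sequences are hypothetical and are not taken from the study, which worked from single-cell sequencing data.

```python
import re

# N-linked glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A zero-width lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def n_motif_positions(protein: str) -> list[int]:
    """Return 0-based start positions of N-X-S/T sequons in a protein sequence."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]


def acquired_n_motifs(germline: str, tumor: str) -> list[int]:
    """Positions that carry a sequon in the tumor sequence but not in the germline.

    Assumption: the two sequences are already aligned and have equal length.
    """
    if len(germline) != len(tumor):
        raise ValueError("sequences must be aligned to equal length")
    germline_sites = set(n_motif_positions(germline))
    return [p for p in n_motif_positions(tumor) if p not in germline_sites]


# Toy sequences (hypothetical): two substitutions create an N-K-S sequon.
germline = "QVQLVESGGKVSCAA"
tumor = "QVQLVESGNKSSCAA"
print(acquired_n_motifs(germline, tumor))  # [8]
```

A real analysis would first align each tumor sequence to its inferred germline V(D)J sequence; the equal-length requirement here stands in for that step.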

Subject(s)

Lymphoma, Follicular , Humans , Lymphoma, Follicular/pathology , Glycosylation , Phylogeny , Receptors, Antigen, B-Cell/genetics , Receptors, Antigen, B-Cell/metabolism , Lectins , Tumor Microenvironment

2.

KmerKeys: a web resource for searching indexed genome assemblies and variants.

Pavlichin, Dmitri S; Lee, HoJoon; Greer, Stephanie U; Grimes, Susan M; Weissman, Tsachy; Ji, Hanlee P.

Nucleic Acids Res ; 50(W1): W448-W453, 2022 07 05.

Article in English | MEDLINE | ID: mdl-35474383

ABSTRACT

K-mers are short DNA sequences that are used for genome sequence analysis. Applications that use k-mers include genome assembly and alignment. However, the wider bioinformatic use of these short sequences has challenges related to the massive scale of genomic sequence data. A single human genome assembly has billions of k-mers. As a result, the computational requirements for analyzing k-mer information are enormous, particularly when involving complete genome assemblies. To address these issues, we developed a new indexing data structure based on a hash table tuned for the lookup of short sequence keys. This web application, referred to as KmerKeys, provides rapid query speeds for cloud computation on genome assemblies. We enable fuzzy as well as exact sequence searches of assemblies. To enable robust and speedy performance, the website implements cache-friendly hash tables, memory mapping and massive parallel processing. Our method employs a scalable and efficient data structure that can be used to jointly index and search a large collection of human genome assembly information. One can include variant databases and their associated metadata such as the gnomAD population variant catalogue. This feature enables the incorporation of future genomic information into sequencing analysis. KmerKeys is freely accessible at https://kmerkeys.dgi-stanford.org.
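As a rough illustration of hash-table k-mer indexing with exact and fuzzy lookup, the sketch below uses a plain Python dictionary. It is not the KmerKeys implementation, which the abstract describes as a tuned, cache-friendly, memory-mapped structure. All function names are hypothetical, and "fuzzy" is assumed here to mean a Hamming distance of at most 1, which the abstract does not specify.

```python
from collections import defaultdict


def build_kmer_index(sequences: dict[str, str], k: int) -> dict[str, list[tuple[str, int]]]:
    """Map every k-mer to the (sequence name, 0-based offset) pairs where it occurs."""
    index: dict[str, list[tuple[str, int]]] = defaultdict(list)
    for name, seq in sequences.items():
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return dict(index)


def exact_search(index, kmer: str):
    """Return all occurrences of the k-mer, or an empty list if it is absent."""
    return index.get(kmer.upper(), [])


def fuzzy_search(index, kmer: str, alphabet: str = "ACGT"):
    """Return hits for the k-mer and for every indexed k-mer one substitution away."""
    kmer = kmer.upper()
    hits = {}
    if kmer in index:
        hits[kmer] = index[kmer]
    for i, base in enumerate(kmer):
        for alt in alphabet:
            if alt == base:
                continue
            neighbor = kmer[:i] + alt + kmer[i + 1:]
            if neighbor in index:
                hits[neighbor] = index[neighbor]
    return hits


# Toy assemblies (hypothetical)
assemblies = {"asm1": "ACGTACGA", "asm2": "TTACGT"}
index = build_kmer_index(assemblies, k=4)
print(exact_search(index, "ACGT"))
print(sorted(fuzzy_search(index, "ACGT")))
```

Enumerating the 3k single-substitution neighbors keeps fuzzy lookup at a fixed number of hash probes per query, independent of index size.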

Subject(s)

Algorithms , Sequence Analysis, DNA , Software , Humans , Genome, Human , Genomics/methods , Sequence Analysis, DNA/methods

3.

Short Tandem Repeat DNA Profiling Using Perylene-Oligonucleotide Fluorescence Assay.

Hernández Bustos, Adrián; Martiny, Elisa; Bom Pedersen, Nadia; Parvathaneni, Rohith Pavan; Hansen, Jonas; Ji, Hanlee P; Astakhova, Kira.

Anal Chem ; 95(20): 7872-7879, 2023 05 23.

Article in English | MEDLINE | ID: mdl-37183373

ABSTRACT

We report an amplification-free genotyping method to determine the number of human short tandem repeats (STRs). DNA-based STR profiling is a robust method for genetic identification purposes such as forensics and biobanking and for identifying specific molecular subtypes of cancer. STR detection requires polymerase amplification, which introduces errors that obscure the correct genotype. We developed a new method that requires no polymerase. First, we synthesized perylene-nucleoside reagents and incorporated them into oligonucleotide probes that recognize five common human STRs. Using these probes and a bead-based hybridization approach, accurate STR detection was achieved in only 1.5 h, including DNA preparation steps, with up to a 1000-fold target DNA enrichment. This method was comparable to PCR-based assays. Using standard fluorometry, the limit of detection was 2.00 ± 0.07 pM for a given target. We used this assay to accurately identify STRs from 50 human subjects, achieving >98% consensus with sequencing data for STR genotyping.

Subject(s)

DNA Fingerprinting , Perylene , Humans , DNA Fingerprinting/methods , Oligonucleotides , Biological Specimen Banks , Microsatellite Repeats , DNA/genetics , Genotype

4.

Single-cell analysis can define distinct evolution of tumor sites in follicular lymphoma.

Haebe, Sarah; Shree, Tanaya; Sathe, Anuja; Day, Grady; Czerwinski, Debra K; Grimes, Susan M; Lee, HoJoon; Binkley, Michael S; Long, Steven R; Martin, Brock; Ji, Hanlee P; Levy, Ronald.

Blood ; 137(21): 2869-2880, 2021 05 27.

Article in English | MEDLINE | ID: mdl-33728464

ABSTRACT

Tumor heterogeneity complicates biomarker development and fosters drug resistance in solid malignancies. In lymphoma, our knowledge of site-to-site heterogeneity and its clinical implications is still limited. Here, we profiled 2 nodal, synchronously acquired tumor samples from 10 patients with follicular lymphoma (FL) using single-cell RNA, B-cell receptor (BCR) and T-cell receptor sequencing, and flow cytometry. By following the rapidly mutating tumor immunoglobulin genes, we discovered that BCR subclones were shared between the 2 tumor sites in some patients, but in many patients, the disease had evolved separately with limited tumor cell migration between the sites. Patients exhibiting divergent BCR evolution also exhibited divergent tumor gene-expression and cell-surface protein profiles. While the overall composition of the tumor microenvironment did not differ significantly between sites, we did detect a specific correlation between site-to-site tumor heterogeneity and T follicular helper (Tfh) cell abundance. We further observed enrichment of particular ligand-receptor pairs between tumor and Tfh cells, including CD40 and CD40LG, and a significant correlation between tumor CD40 expression and Tfh proliferation. Our study may explain discordant responses to systemic therapies, underscores the difficulty of capturing a patient's disease with a single biopsy, and furthers our understanding of tumor-immune networks in FL.

Subject(s)

Clonal Evolution/genetics , Lymphoma, Follicular/pathology , Single-Cell Analysis , Adult , Aged , Antigens, Neoplasm/biosynthesis , Antigens, Neoplasm/genetics , Biopsy, Fine-Needle , CD40 Antigens/biosynthesis , CD40 Antigens/genetics , CD40 Ligand/biosynthesis , CD40 Ligand/genetics , DNA, Neoplasm/genetics , Disease Progression , Female , Flow Cytometry , Gene Rearrangement, B-Lymphocyte, Light Chain , Gene Rearrangement, T-Lymphocyte , Humans , Lymph Nodes/chemistry , Lymph Nodes/ultrastructure , Lymphocytes, Tumor-Infiltrating/immunology , Lymphoma, Follicular/chemistry , Lymphoma, Follicular/genetics , Male , Middle Aged , Neoplasm Proteins/biosynthesis , Neoplasm Proteins/genetics , Phylogeny , RNA, Neoplasm/genetics , Sequence Alignment , Sequence Homology, Nucleic Acid , T Follicular Helper Cells/immunology , T Follicular Helper Cells/metabolism , Transcriptome , Tumor Microenvironment

5.

New approaches to moderate CRISPR-Cas9 activity: Addressing issues of cellular uptake and endosomal escape.

van Hees, Maja; Slott, Sofie; Hansen, Anders Højgaard; Kim, Heon Seok; Ji, Hanlee P; Astakhova, Kira.

Mol Ther ; 30(1): 32-46, 2022 01 05.

Article in English | MEDLINE | ID: mdl-34091053

ABSTRACT

CRISPR-Cas9 is rapidly entering molecular biology and biomedicine as a promising gene-editing tool. A unique feature of CRISPR-Cas9 is a single-guide RNA directing a Cas9 nuclease toward its genomic target. Herein, we highlight new approaches for improving cellular uptake and endosomal escape of CRISPR-Cas9. As opposed to other recently published works, this review is focused on non-viral carriers as a means to facilitate the cellular uptake of CRISPR-Cas9 through endocytosis. The majority of non-viral carriers, such as gold nanoparticles, polymer nanoparticles, lipid nanoparticles, and nanoscale zeolitic imidazole frameworks, are developed with a focus toward optimizing the endosomal escape of CRISPR-Cas9 by taking advantage of the acidic environment in the late endosomes. Among the most broadly used methods for in vitro and ex vivo ribonucleoprotein transfection are electroporation and microinjection. Thus, other delivery formats are warranted for in vivo delivery of CRISPR-Cas9. Herein, we specifically review the use of peptide and nanoparticle-based systems as platforms for CRISPR-Cas9 delivery in vivo. Finally, we highlight future perspectives of the CRISPR-Cas9 gene-editing tool and the prospects of using non-viral vectors to improve its bioavailability and therapeutic potential.

Subject(s)

CRISPR-Cas Systems , Metal Nanoparticles , Endosomes/metabolism , Gene Editing/methods , Gold/metabolism , Liposomes , Nanoparticles

6.

Comprehensive, integrated, and phased whole-genome analysis of the primary ENCODE cell line K562.

Zhou, Bo; Ho, Steve S; Greer, Stephanie U; Zhu, Xiaowei; Bell, John M; Arthur, Joseph G; Spies, Noah; Zhang, Xianglong; Byeon, Seunggyu; Pattni, Reenal; Ben-Efraim, Noa; Haney, Michael S; Haraksingh, Rajini R; Song, Giltae; Ji, Hanlee P; Perrin, Dimitri; Wong, Wing H; Abyzov, Alexej; Urban, Alexander E.

Genome Res ; 29(3): 472-484, 2019 03.

Article in English | MEDLINE | ID: mdl-30737237

ABSTRACT

K562 is widely used in biomedical research. It is one of three tier-one cell lines of ENCODE and also most commonly used for large-scale CRISPR/Cas9 screens. Although its functional genomic and epigenomic characteristics have been extensively studied, its genome sequence and genomic structural features have never been comprehensively analyzed. Such information is essential for the correct interpretation and understanding of the vast troves of existing functional genomics and epigenomics data for K562. We performed and integrated deep-coverage whole-genome (short-insert), mate-pair, and linked-read sequencing as well as karyotyping and array CGH analysis to identify a wide spectrum of genome characteristics in K562: copy numbers (CN) of aneuploid chromosome segments at high resolution, SNVs and indels (both corrected for CN in aneuploid regions), loss of heterozygosity, megabase-scale phased haplotypes often spanning entire chromosome arms, structural variants (SVs), including small and large-scale complex SVs and nonreference retrotransposon insertions. Many SVs were phased, assembled, and experimentally validated. We identified multiple allele-specific deletions and duplications within the tumor suppressor gene FHIT. Taking aneuploidy into account, we reanalyzed K562 RNA-seq and whole-genome bisulfite sequencing data for allele-specific expression and allele-specific DNA methylation. We also show examples of how deeper insights into regulatory complexity are gained by integrating genomic variant information and structural context with functional genomics and epigenomics data. Furthermore, using K562 haplotype information, we produced an allele-specific CRISPR targeting map. This comprehensive whole-genome analysis serves as a resource for future studies that utilize K562 as well as a framework for the analysis of other cancer genomes.

Subject(s)

Genome, Human , Humans , K562 Cells , Karyotype , Polymorphism, Genetic , Whole Genome Sequencing

7.

Single-cell RNA-Seq of follicular lymphoma reveals malignant B-cell types and coexpression of T-cell immune checkpoints.

Andor, Noemi; Simonds, Erin F; Czerwinski, Debra K; Chen, Jiamin; Grimes, Susan M; Wood-Bouwens, Christina; Zheng, Grace X Y; Kubit, Matthew A; Greer, Stephanie; Weiss, William A; Levy, Ronald; Ji, Hanlee P.

Blood ; 133(10): 1119-1129, 2019 03 07.

Article in English | MEDLINE | ID: mdl-30591526

ABSTRACT

Follicular lymphoma (FL) is a low-grade B-cell malignancy that transforms into a highly aggressive and lethal disease at a rate of 2% per year. Perfect isolation of the malignant B-cell population from a surgical biopsy is a significant challenge, masking important FL biology, such as immune checkpoint coexpression patterns. To resolve the underlying transcriptional networks of follicular B-cell lymphomas, we analyzed the transcriptomes of 34,188 cells derived from 6 primary FL tumors. For each tumor, we identified normal immune subpopulations and malignant B cells, based on gene expression. We used multicolor flow cytometry analysis of the same tumors to confirm our assignments of cellular lineages and validate our predictions of expressed proteins. Comparison of gene expression between matched malignant and normal B cells from the same patient revealed tumor-specific features. Malignant B cells exhibited restricted immunoglobulin (Ig) light chain expression (either Igκ or Igλ), as well as the expected upregulation of the BCL2 gene, but also downregulation of the FCER2, CD52, and major histocompatibility complex class II genes. By analyzing thousands of individual cells per patient tumor, we identified the mosaic of malignant B-cell subclones that coexist within an FL and examined the characteristics of tumor-infiltrating T cells. We identified genes coexpressed with immune checkpoint molecules, such as CEBPA and B2M in regulatory T (Treg) cells, providing a better understanding of the gene networks involved in immune regulation. In summary, parallel measurement of single-cell expression in thousands of tumor cells and tumor-infiltrating lymphocytes can be used to obtain a systems-level view of the tumor microenvironment and identify new avenues for therapeutic development.

Subject(s)

Lymphoma, B-Cell/genetics , Lymphoma, Follicular/genetics , T-Lymphocytes, Regulatory/cytology , Biopsy , CCAAT-Enhancer-Binding Proteins/genetics , CD4-Positive T-Lymphocytes/cytology , CD52 Antigen/genetics , Cell Lineage , Flow Cytometry , Gene Expression Profiling , Gene Expression Regulation, Leukemic , Hematopoietic Stem Cells/cytology , Histocompatibility Antigens Class II/metabolism , Humans , Immune System , Immunoglobulin G , Lectins, C-Type/genetics , Leukocytes, Mononuclear/cytology , Lymphoma, B-Cell/blood , Lymphoma, Follicular/blood , Palatine Tonsil/metabolism , Receptors, IgE/genetics , Sequence Analysis, RNA , Transcriptome , Tumor Microenvironment , beta 2-Microglobulin/genetics

8.

Targeted short read sequencing and assembly of re-arrangements and candidate gene loci provide megabase diplotypes.

Shin, GiWon; Greer, Stephanie U; Xia, Li C; Lee, HoJoon; Zhou, Jun; Boles, T Christian; Ji, Hanlee P.

Nucleic Acids Res ; 47(19): e115, 2019 11 04.

Article in English | MEDLINE | ID: mdl-31350896

ABSTRACT

The human genome is composed of two haplotypes, otherwise called diplotypes, which denote phased polymorphisms and structural variations (SVs) that are derived from both parents. Diplotypes place genetic variants in the context of cis-related variants from a diploid genome. As a result, they provide valuable information about hereditary transmission, context of SV, regulation of gene expression and other features which are informative for understanding human genetics. Successful diplotyping with short read whole genome sequencing generally requires either a large population or parent-child trio samples. To overcome these limitations, we developed a targeted sequencing method for generating megabase (Mb)-scale haplotypes with short reads. One selects specific 0.1-0.2 Mb high molecular weight DNA targets with custom-designed Cas9-guide RNA complexes followed by sequencing with barcoded linked reads. To test this approach, we designed three assays, targeting the BRCA1 gene, the entire 4-Mb major histocompatibility complex locus and 18 well-characterized SVs, respectively. Using an integrated alignment- and assembly-based approach, we generated comprehensive variant diplotypes spanning the entirety of the targeted loci and characterized SVs with exact breakpoints. Our results were comparable in quality to long read sequencing.

Subject(s)

Genome, Human/genetics , Genomic Structural Variation/genetics , High-Throughput Nucleotide Sequencing/methods , Whole Genome Sequencing/methods , Diploidy , Gene Expression Regulation/genetics , Genetic Association Studies/methods , Haplotypes/genetics , Humans , Sequence Analysis, DNA/methods

9.

Haplotype-resolved and integrated genome analysis of the cancer cell line HepG2.

Zhou, Bo; Ho, Steve S; Greer, Stephanie U; Spies, Noah; Bell, John M; Zhang, Xianglong; Zhu, Xiaowei; Arthur, Joseph G; Byeon, Seunggyu; Pattni, Reenal; Saha, Ishan; Huang, Yiling; Song, Giltae; Perrin, Dimitri; Wong, Wing H; Ji, Hanlee P; Abyzov, Alexej; Urban, Alexander E.

Nucleic Acids Res ; 47(8): 3846-3861, 2019 05 07.

Article in English | MEDLINE | ID: mdl-30864654

ABSTRACT

HepG2 is one of the most widely used human cancer cell lines in biomedical research and one of the main cell lines of ENCODE. Although the functional genomic and epigenomic characteristics of HepG2 are extensively studied, its genome sequence has never been comprehensively analyzed and higher order genomic structural features are largely unknown. The high degree of aneuploidy in HepG2 renders traditional genome variant analysis methods challenging and partially ineffective. Correct and complete interpretation of the extensive functional genomics data from HepG2 requires an understanding of the cell line's genome sequence and genome structure. Using a variety of sequencing and analysis methods, we identified a wide spectrum of genome characteristics in HepG2: copy numbers of chromosomal segments at high resolution, SNVs and Indels (corrected for aneuploidy), regions with loss of heterozygosity, phased haplotypes extending to entire chromosome arms, retrotransposon insertions and structural variants (SVs) including complex and somatic genomic rearrangements. A large number of SVs were phased, sequence assembled and experimentally validated. We re-analyzed published HepG2 datasets for allele-specific expression and DNA methylation and assembled an allele-specific CRISPR/Cas9 targeting map. We demonstrate how deeper insights into genomic regulatory complexity are gained by adopting a genome-integrated framework.

Subject(s)

Chromosome Mapping/methods , Genome, Human , Genomics/methods , Haplotypes , Sequence Analysis, DNA/statistics & numerical data , Alleles , Aneuploidy , DNA Methylation , Genomic Structural Variation , Hep G2 Cells , High-Throughput Nucleotide Sequencing , Humans , INDEL Mutation , Karyotyping , Loss of Heterozygosity , Polymorphism, Single Nucleotide , Retroelements

10.

Ultra-fast detection and quantification of nucleic acids by amplification-free fluorescence assay.

Uhd, Jesper; Miotke, Laura; Ji, Hanlee P; Dunaeva, Marina; Pruijn, Ger J M; Jørgensen, Christian Damsgaard; Kristoffersen, Emil Laust; Birkedal, Victoria; Yde, Christina Westmose; Nielsen, Finn Cilius; Hansen, Jonas; Astakhova, Kira.

Analyst ; 145(17): 5836-5844, 2020 08 24.

Article in English | MEDLINE | ID: mdl-32648858

ABSTRACT

Two types of clinically important nucleic acid biomarkers, microRNA (miRNA) and circulating tumor DNA (ctDNA), were detected and quantified from human serum using an amplification-free fluorescence hybridization assay. Specifically, miRNAs hsa-miR-223-3p and hsa-miR-486-5p with relevance for rheumatoid arthritis and cancer-related mutations BRAF and KRAS of ctDNA were directly measured. The required oligonucleotide probes for the assay were rationally designed and synthesized through a novel "clickable" approach which is time- and cost-effective. With no need for isolating nucleic acid components from serum, the fluorescence-based assay took only 1 hour. Detection and absolute quantification of targets was successfully achieved despite their notoriously low abundance, with a precision down to individual nucleotides. Obtained miRNA and ctDNA amounts showed overall a good correlation with current techniques. With appropriate probes, our novel assay and signal boosting approach could become a useful tool for point-of-care measurement of other low-abundance nucleic acid biomarkers.

Subject(s)

Circulating Tumor DNA , MicroRNAs , Nucleic Acids , Biomarkers , Humans , MicroRNAs/genetics , Nucleic Acid Hybridization

11.

Identification of large rearrangements in cancer genomes with barcode linked reads.

Xia, Li C; Bell, John M; Wood-Bouwens, Christina; Chen, Jiamin J; Zhang, Nancy R; Ji, Hanlee P.

Nucleic Acids Res ; 46(4): e19, 2018 02 28.

Article in English | MEDLINE | ID: mdl-29186506

ABSTRACT

Large genomic rearrangements involve inversions, deletions and other structural changes that span megabase segments of the human genome. This category of genetic aberration is the cause of many hereditary genetic disorders and contributes to pathogenesis of diseases like cancer. We developed a new algorithm called ZoomX for analysing barcode-linked sequence reads; these sequences can be traced to individual high molecular weight DNA molecules (>50 kb). To generate barcode-linked sequence reads, we employ a library preparation technology (10X Genomics) that uses droplets to partition and barcode DNA molecules. Using linked read data from whole genome sequencing, we identify large genomic rearrangements, typically greater than 200 kb, even when they are only present in low allelic fractions. Our algorithm uses a Poisson scan statistic to identify genomic rearrangement junctions, determine counts of junction-spanning molecules and calculate a Fisher's exact test for determining statistical significance for somatic aberrations. Utilizing a well-characterized human genome, we benchmarked this approach to accurately identify large rearrangements. Subsequently, we demonstrated that our algorithm identifies somatic rearrangements when present in lower allelic fractions as occurs in tumors. We characterized a set of complex cancer rearrangements with multiple classes of structural aberrations and with possible roles in oncogenesis.
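The significance step can be sketched with a one-sided Fisher's exact test on counts of junction-spanning molecules. The 2x2 table layout (tumor versus matched normal, spanning versus non-spanning molecules) is an assumption made for illustration, as are the function name and the counts. The Poisson scan statistic that ZoomX uses to locate candidate junctions is not reproduced here.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a) under the hypergeometric null with all margins fixed,
    i.e. the probability of seeing at least `a` counts in the top-left cell.
    """
    row1 = a + b          # total molecules in the first sample
    col1 = a + c          # total junction-spanning molecules
    n = a + b + c + d     # all molecules
    upper = min(row1, col1)
    tail = sum(comb(row1, x) * comb(n - row1, col1 - x) for x in range(a, upper + 1))
    return tail / comb(n, col1)


# Hypothetical counts at one candidate junction:
#            spanning  not spanning
# tumor          8          92
# normal         1          99
p_value = fisher_exact_greater(8, 92, 1, 99)
print(f"one-sided p = {p_value:.4f}")
```

Exact integer arithmetic via `math.comb` avoids the rounding problems of factorial-based formulas at these sample sizes; for much larger counts a log-space implementation would be preferable.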

Subject(s)

Genomic Structural Variation , Neoplasms/genetics , Whole Genome Sequencing/methods , Algorithms , Chromosome Aberrations , Gastrointestinal Neoplasms/genetics , Genome, Human , Humans

12.

Covalent "Click Chemistry"-Based Attachment of DNA onto Solid Phase Enables Iterative Molecular Analysis.

Lau, Billy T; Ji, Hanlee P.

Anal Chem ; 91(3): 1706-1710, 2019 02 05.

Article in English | MEDLINE | ID: mdl-30652472

ABSTRACT

Molecular analysis of DNA samples with limited quantities can be challenging. Repeatedly sequencing the original DNA molecules from a given sample would overcome many issues related to accurate genetic analysis and mitigate issues with processing small amounts of DNA analyte. Moreover, an iterative, replicated analysis of the same DNA molecule has the potential to improve genetic characterization. Herein, we demonstrate that the use of "click"-based attachment of DNA sequencing libraries onto an agarose bead support enables repetitive primer extension assays for specific genomic DNA targets such as gene exons. We validated the performance of this assay for evaluating specific genetic alterations in both normal and cancer reference standard DNA samples. We demonstrate the stability of conjugated DNA libraries and related sequencing results over the course of independent serial assays spanning several months from the same set of samples. Finally, we applied this method to DNA derived from a tumor sample and demonstrated improved mutation detection accuracy.

Subject(s)

DNA, Neoplasm/analysis , High-Throughput Nucleotide Sequencing/methods , Cell Line, Tumor , Click Chemistry , Cycloaddition Reaction , DNA, Neoplasm/chemistry , DNA, Neoplasm/genetics , Gene Library , Humans , Mutation , Neoplasms/genetics , Proof of Concept Study , Sepharose/chemistry

13.

Chromosome-scale mega-haplotypes enable digital karyotyping of cancer aneuploidy.

Bell, John M; Lau, Billy T; Greer, Stephanie U; Wood-Bouwens, Christina; Xia, Li C; Connolly, Ian D; Gephart, Melanie H; Ji, Hanlee P.

Nucleic Acids Res ; 45(19): e162, 2017 11 02.

Article in English | MEDLINE | ID: mdl-28977555

ABSTRACT

Genomic instability is a frequently occurring feature of cancer that involves large-scale structural alterations. These somatic changes in chromosome structure include duplication of entire chromosome arms and aneuploidy where chromosomes are duplicated beyond normal diploid content. However, the accurate determination of aneuploidy events in cancer genomes is a challenge. Recent advances in sequencing technology allow the characterization of haplotypes that extend megabases along the human genome using high molecular weight (HMW) DNA. For this study, we employed a library preparation method in which sequence reads have barcodes linked to single HMW DNA molecules. Barcode-linked reads are used to generate extended haplotypes on the order of megabases. We developed a method that leverages haplotypes to identify chromosomal segmental alterations in cancer and uses this information to join haplotypes together, thus extending the range of phased variants. With this approach, we identified mega-haplotypes that encompass entire chromosome arms. We characterized the chromosomal arm changes and aneuploidy events in a manner that offers similar information as a traditional karyotype but with the benefit of DNA sequence resolution. We applied this approach to characterize aneuploidy and chromosomal alterations from a series of primary colorectal cancers.

Subject(s)

Aneuploidy , Haplotypes , Neoplasms/genetics , Chromosome Aberrations , Colorectal Neoplasms/diagnosis , Colorectal Neoplasms/genetics , DNA Mutational Analysis/methods , Genome, Human/genetics , Genomic Instability , High-Throughput Nucleotide Sequencing/methods , Humans , Karyotype , Karyotyping/methods , Neoplasms/diagnosis , Reproducibility of Results , Sensitivity and Specificity

14.

A genome-wide approach for detecting novel insertion-deletion variants of mid-range size.

Xia, Li C; Sakshuwong, Sukolsak; Hopmans, Erik S; Bell, John M; Grimes, Susan M; Siegmund, David O; Ji, Hanlee P; Zhang, Nancy R.

Nucleic Acids Res ; 44(15): e126, 2016 09 06.

Article in English | MEDLINE | ID: mdl-27325742

ABSTRACT

We present SWAN, a statistical framework for robust detection of genomic structural variants in next-generation sequencing data and an analysis of mid-range size insertions and deletions (<10 Kb) for whole genome analysis and DNA mixtures. To identify these mid-range size events, SWAN collectively uses information from read-pair, read-depth and one end mapped reads through statistical likelihoods based on Poisson field models. SWAN also uses soft-clip/split read remapping to supplement the likelihood analysis and determine variant boundaries. The accuracy of SWAN is demonstrated by in silico spike-ins and by identification of known variants in the NA12878 genome. We used SWAN to identify a novel set of mid-range insertions/deletions that were confirmed by targeted deep re-sequencing. An R package implementation of SWAN is open source and freely available.

Subject(s)

DNA Mutational Analysis/methods , Genome/genetics , Genomics/methods , INDEL Mutation/genetics , Adenoviridae/genetics , Algorithms , Animals , Benchmarking , Computer Simulation , Datasets as Topic , Pan troglodytes/virology , Poisson Distribution , Reproducibility of Results

15.

Single molecule counting and assessment of random molecular tagging errors with transposable giga-scale error-correcting barcodes.

Lau, Billy T; Ji, Hanlee P.

BMC Genomics ; 18(1): 745, 2017 09 21.

Article in English | MEDLINE | ID: mdl-28934929

ABSTRACT

BACKGROUND: RNA-Seq measures gene expression by counting sequence reads belonging to unique cDNA fragments. Molecular barcodes commonly in the form of random nucleotides were recently introduced to improve gene expression measures by detecting amplification duplicates, but are susceptible to errors generated during PCR and sequencing. This results in false positive counts, leading to inaccurate transcriptome quantification especially at low input and single-cell RNA amounts where the total number of molecules present is minuscule. To address this issue, we demonstrated the systematic identification of molecular species using transposable error-correcting barcodes that are exponentially expanded to tens of billions of unique labels. RESULTS: We experimentally showed random-mer molecular barcodes suffer from substantial and persistent errors that are difficult to resolve. To assess our method's performance, we applied it to the analysis of known reference RNA standards. By including an inline random-mer molecular barcode, we systematically characterized the presence of sequence errors in random-mer molecular barcodes. We observed that such errors are extensive and become more dominant at low input amounts. CONCLUSIONS: We described the first study to use transposable molecular barcodes and to apply them to the study of random-mer molecular barcode errors. Extensive errors found in random-mer molecular barcodes may warrant the use of error-correcting barcodes for transcriptome analysis as input amounts decrease.
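Why barcode errors inflate molecule counts, and how an error-correcting design can undo that, can be shown with a toy example. The sketch assumes a known whitelist of barcodes separated by a minimum Hamming distance of 3, so that any single substitution maps back to exactly one valid barcode. The whitelist, the reads, and the function names are hypothetical, and the transposable barcode construction described in the abstract is not modeled.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def correct_barcode(observed: str, whitelist: list[str], max_dist: int = 1):
    """Return the unique whitelist barcode within max_dist, or None if ambiguous/absent."""
    matches = [bc for bc in whitelist
               if len(bc) == len(observed) and hamming(bc, observed) <= max_dist]
    return matches[0] if len(matches) == 1 else None


def count_molecules(reads, whitelist) -> Counter:
    """Count distinct corrected barcodes per gene; uncorrectable reads are dropped."""
    seen = set()
    for gene, barcode in reads:
        fixed = correct_barcode(barcode, whitelist)
        if fixed is not None:
            seen.add((gene, fixed))
    return Counter(gene for gene, _ in seen)


# Hypothetical whitelist with pairwise Hamming distance >= 3
whitelist = ["AAAAAA", "CCCAAA", "AAACCC", "CCCCCC"]
reads = [
    ("GAPDH", "AAAAAA"),
    ("GAPDH", "AAAAAT"),  # sequencing error in AAAAAA
    ("GAPDH", "CCCAAA"),
    ("ACTB", "CCCCCC"),
    ("ACTB", "GGGGGG"),   # matches nothing, discarded
]
naive = Counter(gene for gene, _ in set(reads))
print("naive:", dict(naive))
print("corrected:", dict(count_molecules(reads, whitelist)))
```

With purely random barcodes there is no whitelist to correct against, so an erroneous barcode is indistinguishable from a genuinely new molecule; that is the false positive counting the abstract describes.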

Subject(s)

DNA Transposable Elements/genetics , Sequence Analysis, RNA/methods , DNA, Complementary/genetics , Gene Expression Profiling

16.

Robust Multiplexed Clustering and Denoising of Digital PCR Assays by Data Gridding.

Lau, Billy T; Wood-Bouwens, Christina; Ji, Hanlee P.

Anal Chem ; 89(22): 11913-11917, 2017 11 21.

Article in English | MEDLINE | ID: mdl-29083143

ABSTRACT

Digital PCR (dPCR) relies on the analysis of individual partitions to accurately quantify nucleic acid species. The most widely used analysis method requires manual clustering through individual visual inspection. Some automated analysis methods have emerged but do not robustly account for multiplexed targets, low target concentration, and assay noise. In this study, we describe an open source analysis software called Calico that uses "data gridding" to increase the sensitivity of clustering toward small clusters. Our workflow also generates quality score metrics in order to gauge and filter individual assay partitions by how well they were classified. We applied our analysis algorithm to multiplexed droplet-based digital PCR data sets in both EvaGreen and probes-based schemes, and targeted the oncogenic BRAF V600E and KRAS G12D mutations. We demonstrate an automated clustering sensitivity of down to 0.1% mutant fraction and filtering of artifactual assay partitions from low quality DNA samples. Overall, we demonstrate a vastly improved approach to analyzing ddPCR data that can be applied to clinical use, where automation and reproducibility are critical.
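Once partitions have been classified, quantification typically follows from Poisson statistics: if a fraction p of partitions is positive, the mean number of target copies per partition is -ln(1 - p). The sketch below applies that standard correction to compute a mutant fraction. It does not reproduce Calico's data gridding or quality scoring; the function names and counts are hypothetical.

```python
from math import log


def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean copies per partition: lambda = -ln(1 - positive/total)."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    return -log(1 - positive / total)


def mutant_fraction(mutant_positive: int, wildtype_positive: int, total: int) -> float:
    """Mutant copies as a fraction of all (mutant + wild-type) copies."""
    lam_mut = copies_per_partition(mutant_positive, total)
    lam_wt = copies_per_partition(wildtype_positive, total)
    if lam_mut + lam_wt == 0:
        raise ValueError("no positive partitions for either target")
    return lam_mut / (lam_mut + lam_wt)


# Hypothetical run: 20,000 droplets, 10 mutant-positive, 5,000 wild-type-positive
fraction = mutant_fraction(10, 5000, 20000)
print(f"mutant fraction = {fraction:.4%}")
```

The correction matters most for the abundant wild-type target, where many partitions hold more than one copy; for a rare mutant, lambda is almost equal to the raw positive fraction.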

Subject(s)

Polymerase Chain Reaction/methods , Polymerase Chain Reaction/standards , Automation , Cluster Analysis , Humans , Mutation , Proto-Oncogene Proteins B-raf/genetics , Proto-Oncogene Proteins p21(ras)/genetics , Software

17.

Tandem Oligonucleotide Probe Annealing and Elongation To Discriminate Viral Sequence.

Taskova, Maria; Uhd, Jesper; Miotke, Laura; Kubit, Matthew; Bell, John; Ji, Hanlee P; Astakhova, Kira.

Anal Chem ; 89(8): 4363-4366, 2017 04 18.

Article in English | MEDLINE | ID: mdl-28382823

ABSTRACT

New approaches for genomic DNA/RNA detection are in high demand in order to provide controls for existing enzymatic technologies and to create alternatives for emerging applications. In particular, there is an unmet need for rapid, reliable detection of short RNA regions, which could open up new opportunities in transcriptome analysis, virology, and other fields. Herein, we report for the first time a "click" chemistry approach to oligonucleotide probe elongation as a novel way to specifically detect a viral sequence. We hybridized a library of short, terminally labeled probes to Ebola virus RNA followed by click assembly and analysis of the read sequence by various techniques. As we demonstrate in this paper, using our new approach, a viral RNA sequence can be detected in less than 2 h without the need for cDNA synthesis or any other enzymatic reactions and with a sensitivity of <10 pM target RNA.

Subject(s)

Ebolavirus/genetics , Oligonucleotide Probes/metabolism , RNA, Viral/metabolism , Carbocyanines/chemistry , Click Chemistry , Discriminant Analysis , Nucleic Acid Hybridization , Oligonucleotide Probes/genetics , Oligonucleotides/chemistry , Polymorphism, Single Nucleotide , RNA, Viral/analysis

18.

Allele-specific copy number profiling by next-generation DNA sequencing.

Chen, Hao; Bell, John M; Zavala, Nicolas A; Ji, Hanlee P; Zhang, Nancy R.

Nucleic Acids Res ; 43(4): e23, 2015 Feb 27.

Article in English | MEDLINE | ID: mdl-25477383

ABSTRACT

The progression and clonal development of tumors often involve amplifications and deletions of genomic DNA. Estimation of allele-specific copy number, which quantifies the number of copies of each allele at each variant locus rather than the total number of chromosome copies, is an important step in the characterization of tumor genomes and the inference of their clonal history. We describe a new method, falcon, for finding somatic allele-specific copy number changes by next-generation sequencing of tumors with matched normals. falcon is based on a change-point model on a bivariate mixed Binomial process, which explicitly models the copy numbers of the two chromosome haplotypes and corrects for local allele-specific coverage biases. By using the Binomial distribution rather than a normal approximation, falcon more effectively pools evidence from sites with low coverage. A modified Bayesian information criterion is used to guide model selection for determining the number of copy number events. Falcon is evaluated on in silico spike-in data and applied to the analysis of a pre-malignant colon tumor sample and late-stage colorectal adenocarcinoma from the same individual. The allele-specific copy number estimates obtained by falcon allow us to draw detailed conclusions regarding the clonal history of the individual's colon cancer.
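To make the change-point idea concrete, here is a deliberately simplified Python sketch. It is not falcon: it uses a single Binomial per segment instead of the bivariate mixed Binomial process, ignores the matched normal and coverage-bias correction, and replaces the modified Bayesian information criterion with a raw log-likelihood gain. All function names and the toy counts are assumptions. It shows only how Binomial likelihoods pool allele counts across sites to locate a single change point.

```python
import math


def binom_loglik(b_counts, depths, p):
    """Binomial log-likelihood of B-allele counts, omitting the constant term."""
    p = min(max(p, 1e-9), 1 - 1e-9)  # keep log() finite at the boundaries
    return sum(
        b * math.log(p) + (n - b) * math.log(1 - p)
        for b, n in zip(b_counts, depths)
    )


def segment_loglik(b_counts, depths):
    """Log-likelihood of one segment at its pooled maximum-likelihood fraction."""
    p_hat = sum(b_counts) / sum(depths)
    return binom_loglik(b_counts, depths, p_hat)


def best_split(b_counts, depths):
    """Scan every split point; return (index, gain) of the best one, or (None, 0.0)."""
    whole = segment_loglik(b_counts, depths)
    best_k, best_gain = None, 0.0
    for k in range(1, len(depths)):
        gain = (
            segment_loglik(b_counts[:k], depths[:k])
            + segment_loglik(b_counts[k:], depths[k:])
            - whole
        )
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain


# Toy heterozygous sites at depth 10: balanced alleles, then an allelic imbalance.
depths = [10] * 8
b_counts = [5, 4, 6, 5, 8, 9, 8, 9]

k, gain = best_split(b_counts, depths)
print("best split before site index:", k, "log-likelihood gain:", round(gain, 2))
```

Because counts are summed before the allele fraction is estimated, many low-coverage sites contribute jointly to each segment, which is the pooling advantage the abstract attributes to the Binomial model. A practical caller would also need a penalty for each added change point, the role played by the modified Bayesian information criterion in the abstract.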

Subject(s)

Alleles , Gene Dosage , High-Throughput Nucleotide Sequencing/methods , Neoplasms/genetics , Sequence Analysis, DNA/methods , Software , Adenocarcinoma/genetics , Clonal Evolution , Colorectal Neoplasms/genetics , Humans

19.

A programmable method for massively parallel targeted sequencing.

Hopmans, Erik S; Natsoulis, Georges; Bell, John M; Grimes, Susan M; Sieh, Weiva; Ji, Hanlee P.

Nucleic Acids Res ; 42(10): e88, 2014 Jun.

Article in English | MEDLINE | ID: mdl-24782526

ABSTRACT

We have developed a targeted resequencing approach referred to as Oligonucleotide-Selective Sequencing. In this study, we report a series of significant improvements and novel applications of this method, whereby the surface of a sequencing flow cell is modified in situ to capture specific genomic regions of interest from a sample, which are then sequenced. These improvements include a fully automated targeted sequencing platform through the use of a standard Illumina cBot fluidics station. Targeting optimization increased the yield of total on-target sequencing data 2-fold compared to the previous iteration, while simultaneously increasing the percentage of reads that could be mapped to the human genome. The described assays cover up to 1421 genes with a total coverage of 5.5 Megabases (Mb). We demonstrate a 10-fold abundance uniformity of greater than 90% in 1 log distance from the median and a targeting rate of up to 95%. We also sequenced continuous genomic loci up to 1.5 Mb while simultaneously genotyping SNPs and genes. Variants with low minor allele fraction were sensitively detected at levels of 5%. Finally, we determined the exact breakpoint sequence of cancer rearrangements. Overall, this approach has high performance for selective sequencing of genome targets, configuration flexibility and variant calling accuracy.

Subject(s)

High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Alleles , Chromosome Breakpoints , DNA Primers , Genome, Human , Genomic Structural Variation , Genomics/methods , Humans , Mutation , Neoplasms/genetics , Polymorphism, Single Nucleotide

20.

MendeLIMS: a web-based laboratory information management system for clinical genome sequencing.

Grimes, Susan M; Ji, Hanlee P.

BMC Bioinformatics ; 15: 290, 2014 Aug 27.

Article in English | MEDLINE | ID: mdl-25159034

ABSTRACT

BACKGROUND: Large clinical genomics studies using next generation DNA sequencing require the ability to select and track samples from a large population of patients through many experimental steps. With the number of clinical genome sequencing studies increasing, it is critical to maintain adequate laboratory information management systems to manage the thousands of patient samples that are subject to this type of genetic analysis. RESULTS: To meet the needs of clinical population studies using genome sequencing, we developed a web-based laboratory information management system (LIMS) with a flexible configuration that is adaptable to continuously evolving experimental protocols of next generation DNA sequencing technologies. Our system, referred to as MendeLIMS, is easily implemented with open source tools and is highly configurable and extensible. MendeLIMS has been invaluable in the management of our clinical genome sequencing studies. CONCLUSIONS: We maintain a publicly available demonstration version of the application for evaluation purposes at http://mendelims.stanford.edu. MendeLIMS is programmed in Ruby on Rails (RoR) and accesses data stored in SQL-compliant relational databases. Software is freely available for non-commercial use at http://dna-discovery.stanford.edu/software/mendelims/.

Subject(s)

Clinical Laboratory Techniques , Genomics/methods , Health Information Management/methods , Internet , Medical Informatics/methods , Sequence Analysis, DNA/methods , Software , Databases, Genetic , Gene Library , Genetic Testing , High-Throughput Nucleotide Sequencing , Humans , User-Computer Interface

