Results 1 - 10 of 10
1.
BMC Genomics ; 22(1): 689, 2021 Sep 22.
Article in English | MEDLINE | ID: mdl-34551708

ABSTRACT

BACKGROUND: Recent studies have demonstrated the utility of scRNA-seq SNVs to distinguish tumor from normal cells, characterize intra-tumoral heterogeneity, and define mutation-associated expression signatures. In addition to cancer studies, SNVs from single cells have been useful in studies of transcriptional burst kinetics, allelic expression, chromosome X inactivation, ploidy estimations, and haplotype inference. RESULTS: To aid these types of studies, we have developed a tool, SCReadCounts, for cell-level tabulation of the sequencing read counts bearing SNV reference and variant alleles from barcoded scRNA-seq alignments. Provided genomic loci and expected alleles, SCReadCounts generates cell-SNV matrices with the absolute variant- and reference-harboring read counts, as well as cell-SNV matrices of expressed Variant Allele Fraction (VAFRNA) suitable for a variety of downstream applications. We demonstrate three different SCReadCounts applications on 59,884 cells from seven neuroblastoma samples: (1) estimation of cell-level expression of known somatic mutations and RNA-editing sites, (2) estimation of cell-level allele expression of biallelic SNVs, and (3) a discovery mode assessment of the reference and each of the three alternative nucleotides at genomic positions of interest that does not require prior SNV information. For the latter, we applied SCReadCounts on the coding regions of KRAS, where it identified known and novel somatic mutations in a low-to-moderate proportion of cells. The SCReadCounts read counts module is benchmarked against the analogous modules of GATK and Samtools. SCReadCounts is freely available ( https://github.com/HorvathLab/NGS ) as 64-bit self-contained binary distributions for Linux and MacOS, in addition to Python source. CONCLUSIONS: SCReadCounts supplies a fast and efficient solution for estimation of cell-level SNV expression from scRNA-seq data. SCReadCounts enables distinguishing cells with monoallelic reference expression from those with no gene expression and is applicable to assess SNVs present in only a small proportion of the cells, such as somatic mutations in cancer.
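
The relationship between per-cell read counts and VAFRNA described in this abstract can be illustrated with a minimal sketch. It is not the SCReadCounts implementation: the function name `vaf_rna`, the `min_reads` default of 3, and the example barcodes and counts are all hypothetical. It only shows how a variant fraction computed as variant reads over total reads separates monoallelic reference expression (VAF of 0) from no coverage (no value).

```python
from typing import Optional


def vaf_rna(ref_reads: int, var_reads: int, min_reads: int = 3) -> Optional[float]:
    """Return var / (var + ref), or None when coverage is below min_reads.

    min_reads is an illustrative threshold, not a documented tool default.
    """
    total = ref_reads + var_reads
    if total < min_reads:
        return None  # too few reads: the locus is not assessable in this cell
    return var_reads / total


# Hypothetical counts at one SNV locus: cell barcode -> (ref reads, var reads)
counts = {"cellA": (8, 0), "cellB": (3, 5), "cellC": (0, 0)}
vafs = {cell: vaf_rna(ref, var) for cell, (ref, var) in counts.items()}

for cell, value in vafs.items():
    print(cell, value)
```

Here `cellA` expresses only the reference allele, `cellB` expresses both alleles, and `cellC` has no reads at the locus, so it is reported as missing rather than as a VAF of 0.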


Subject(s)
RNA, Small Cytoplasmic , Polymorphism, Single Nucleotide , RNA , Sequence Analysis, RNA , Single-Cell Analysis , Software
2.
BMC Genomics ; 22(1): 40, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33419390

ABSTRACT

BACKGROUND: Recently, pioneering expression quantitative trait loci (eQTL) studies on single cell RNA sequencing (scRNA-seq) data have revealed new and cell-specific regulatory single nucleotide variants (SNVs). Here, we present an alternative QTL-related approach applicable to transcribed SNV loci from scRNA-seq data: scReQTL. ScReQTL uses Variant Allele Fraction (VAFRNA) at expressed biallelic loci, and correlates it to gene expression from the corresponding cell. RESULTS: Our approach employs the advantage that, when estimated from multiple cells, VAFRNA can be used to assess effects of SNVs in a single sample or individual. In this setting scReQTL operates in the context of identical genotypes, where it is likely to capture RNA-mediated genetic interactions with cell-specific and transient effects. Applying scReQTL on scRNA-seq data generated on the 10x Genomics Chromium platform using 26,640 mesenchymal cells derived from adipose tissue obtained from three healthy female donors, we identified 1272 unique scReQTLs. ScReQTLs common between individuals or cell types were consistent in terms of the directionality of the relationship and the effect size. Comparative assessment with eQTLs from bulk sequencing data showed that scReQTL analysis identifies a distinct set of SNV-gene correlations that are substantially enriched in known gene-gene interactions and significant genome-wide association studies (GWAS) loci. CONCLUSION: ScReQTL is relevant to the rapidly growing source of scRNA-seq data and can be applied to outline SNVs potentially contributing to cell type-specific and/or dynamic genetic interactions from an individual scRNA-seq dataset. AVAILABILITY: https://github.com/HorvathLab/NGS/tree/master/scReQTL.


Subject(s)
RNA, Small Cytoplasmic , Female , Gene Expression , Gene Expression Profiling , Genome-Wide Association Study , Humans , Sequence Analysis, RNA , Single-Cell Analysis , Software
3.
Bioinformatics ; 36(5): 1351-1359, 2020 03 01.
Article in English | MEDLINE | ID: mdl-31589315

ABSTRACT

MOTIVATION: By testing for associations between DNA genotypes and gene expression levels, expression quantitative trait locus (eQTL) analyses have been instrumental in understanding how thousands of single nucleotide variants (SNVs) may affect gene expression. As compared to DNA genotypes, RNA genetic variation represents a phenotypic trait that reflects the actual allele content of the studied system. RNA genetic variation at expressed SNV loci can be estimated using the proportion of alleles bearing the variant nucleotide (variant allele fraction, VAFRNA). VAFRNA is a continuous measure which allows for precise allele quantitation in loci where the RNA alleles do not scale with the genotype count. We describe a method to correlate VAFRNA with gene expression and assess its ability to identify genetically regulated expression solely from RNA-sequencing (RNA-seq) datasets. RESULTS: We introduce ReQTL, an eQTL modification which substitutes the DNA allele count for the variant allele fraction at expressed SNV loci in the transcriptome (VAFRNA). We exemplify the method on sets of RNA-seq data from human tissues obtained through the Genotype-Tissue Expression (GTEx) project and demonstrate that ReQTL analyses are computationally feasible and can identify a subset of expressed eQTL loci. AVAILABILITY AND IMPLEMENTATION: A toolkit to perform ReQTL analyses is available at https://github.com/HorvathLab/ReQTL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
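
The core idea of this abstract, using the continuous VAFRNA in place of a discrete genotype count when testing association with expression, can be sketched as follows. This is an illustration only and not the ReQTL toolkit: the abstract does not state which statistical model the toolkit uses, so a plain Pearson correlation is swapped in here, and the function name `pearson_r` and the sample values are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: VAF_RNA at one SNV and expression of one gene,
# measured across the same five samples.
vaf = [0.0, 0.25, 0.5, 0.75, 1.0]
expression = [1.0, 2.0, 3.0, 4.0, 5.0]

r = pearson_r(vaf, expression)
print(f"r = {r:.3f}")
```

A conventional eQTL test would use genotype codes such as 0, 1, 2 on the x axis; the sketch simply replaces them with a fraction between 0 and 1.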


Subject(s)
RNA , Software , Humans , Nucleotides , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Sequence Analysis, RNA
4.
Nucleic Acids Res ; 44(22): e161, 2016 12 15.
Article in English | MEDLINE | ID: mdl-27576531

ABSTRACT

We introduce RNA2DNAlign, a computational framework for quantitative assessment of allele counts across paired RNA and DNA sequencing datasets. RNA2DNAlign is based on quantitation of the relative abundance of variant and reference read counts, followed by binomial tests for genotype and allelic status at SNV positions between compatible sequences. RNA2DNAlign detects positions with differential allele distribution, suggesting asymmetries due to regulatory/structural events. Based on the type of asymmetry, RNA2DNAlign outlines positions likely to be implicated in RNA editing, allele-specific expression or loss, somatic mutagenesis or loss-of-heterozygosity (the first three also in a tumor-specific setting). We applied RNA2DNAlign on 360 matching normal and tumor exomes and transcriptomes from 90 breast cancer patients from TCGA. Under high-confidence settings, RNA2DNAlign identified 2038 distinct SNV sites associated with one of the aforementioned asymmetries, the majority of which have not been linked to functionality before. The performance assessment shows very high specificity and sensitivity, due to the corroboration of signals across multiple matching datasets. RNA2DNAlign is freely available from http://github.com/HorvathLab/NGS as a self-contained binary package for 64-bit Linux systems.
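
The abstract mentions binomial tests on variant and reference read counts without giving their exact form. As a generic illustration, and not RNA2DNAlign's actual procedure, the sketch below runs an exact two-sided binomial test of observed variant reads against an assumed expected variant fraction of 0.5 (the balanced heterozygous case). The function name `binom_two_sided_p` and the read counts are hypothetical.

```python
from math import comb


def binom_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Exact two-sided binomial p-value for k successes in n trials.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under success probability p.
    """
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-7)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))


# Hypothetical RNA read counts at a heterozygous SNV: 0 variant reads of 10
# would suggest reference-only expression; 5 of 10 is perfectly balanced.
p_skewed = binom_two_sided_p(0, 10)
p_balanced = binom_two_sided_p(5, 10)
print(p_skewed, p_balanced)
```

A small p-value at a position that is heterozygous in DNA is the kind of asymmetry the abstract associates with events such as allele-specific expression or loss.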


Subject(s)
Sequence Analysis, DNA , Sequence Analysis, RNA , Software , Algorithms , Alleles , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Exome , Female , Gene Expression Profiling , High-Throughput Nucleotide Sequencing , Humans , Loss of Heterozygosity , Polymorphism, Single Nucleotide , RNA Editing , Sensitivity and Specificity , Transcriptome
5.
bioRxiv ; 2024 Jun 03.
Article in English | MEDLINE | ID: mdl-38895201

ABSTRACT

Transposable elements (TEs) are abundant in the human genome, and they provide the sources for genetic and functional diversity. The regulation of TE expression and its functional consequences in physiological conditions and cancer development remain to be fully elucidated. Previous studies suggested TEs are repressed by DNA methylation and chromatin modifications. The effect of 3D chromatin topology on TE regulation remains elusive. Here, by integrating transcriptome and 3D genome architecture studies, we showed that haploinsufficient loss of NIPBL selectively activates alternative promoters at the long terminal repeats (LTRs) of the TE subclasses. This activation occurs through the reorganization of topologically associating domain (TAD) hierarchical structures and recruitment of proximal enhancers. These observations indicate that TAD hierarchy restricts transcriptional activation of LTRs that already possess open chromatin features. In cancer, perturbation of the hierarchical chromatin topology can lead to co-option of LTRs as functional alternative promoters in a context-dependent manner and drive aberrant transcriptional activation of novel oncogenes and other divergent transcripts. These data uncovered a new layer of regulatory mechanism of TE expression beyond DNA and chromatin modification in the human genome. They also posit TAD hierarchy dysregulation as a novel mechanism for alternative promoter-mediated oncogene activation and transcriptional diversity in cancer, which may be exploited therapeutically.

6.
Front Bioeng Biotechnol ; 8: 1021, 2020.
Article in English | MEDLINE | ID: mdl-33042959

ABSTRACT

Variant allele frequencies (VAF) are an important measure of genetic variation that can be estimated at single-nucleotide variant (SNV) sites. RNA and DNA VAFs are used as indicators of a wide range of biological traits, including tumor purity and ploidy changes, allele-specific expression and gene-dosage transcriptional response. Here we present a novel methodology to assess gene and chromosomal allele asymmetries and to aid in identifying genomic alterations in RNA and DNA datasets. Our approach is based on analysis of the VAF distributions in chromosomal segments (continuous multi-SNV genomic regions). In each segment we estimate variant probability, a parameter of a random process that can generate synthetic VAF samples that closely resemble the observed data. We show that variant probability is a biologically interpretable quantitative descriptor of the VAF distribution in chromosomal segments which is consistent with other approaches. To this end, we apply the proposed methodology on data from 72 samples obtained from patients with breast invasive carcinoma (BRCA) from The Cancer Genome Atlas (TCGA). We compare DNA and RNA VAF distributions from matched RNA and whole exome sequencing (WES) datasets and find that both genomic signals give very similar segmentation and estimated variant probability profiles. We also find a correlation between variant probability and copy number alterations (CNA). Finally, to demonstrate a practical application of variant probabilities, we use them to estimate tumor purity. Tumor purity estimates based on variant probabilities demonstrate good concordance with other approaches (Pearson's correlation between 0.44 and 0.76). Our evaluation suggests that variant probabilities can serve as a dependable descriptor of VAF distribution, further enabling the statistical comparison of matched DNA and RNA datasets. Finally, they provide conceptual and mechanistic insights into relations between structure of VAF distributions and genetic events. The methodology is implemented in a Matlab toolbox that provides a suite of functions for analysis, statistical assessment and visualization of Genome and Transcriptome allele frequency distributions. GeTallele is available at: https://github.com/SlowinskiPiotr/GeTallele.

7.
Genes (Basel) ; 11(3)2020 02 25.
Article in English | MEDLINE | ID: mdl-32106453

ABSTRACT

With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10x Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAFRNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAFRNA estimation from current scRNA-seq datasets and shows that the 3'-based library generation protocol of 10x Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
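
The dependence of the bi-allelic share on the minimum read threshold, as reported in this abstract, can be illustrated with a small sketch. It is not the authors' pipeline: the function name `biallelic_fraction` and the read counts are hypothetical, and "bi-allelic" is assumed here to mean at least one reference read and at least one variant read at a covered position.

```python
from typing import List, Optional, Tuple


def biallelic_fraction(
    counts: List[Tuple[int, int]], min_reads: int
) -> Optional[float]:
    """Share of covered (ref, var) observations with reads on both alleles.

    An observation is covered when ref + var >= min_reads. Returns None
    when nothing passes the coverage threshold.
    """
    covered = [(ref, var) for ref, var in counts if ref + var >= min_reads]
    if not covered:
        return None
    both = sum(1 for ref, var in covered if ref > 0 and var > 0)
    return both / len(covered)


# Hypothetical (ref reads, var reads) at heterozygous SNVs in single cells
observations = [(3, 0), (2, 1), (0, 4), (6, 5), (9, 3), (1, 1)]

at_3 = biallelic_fraction(observations, min_reads=3)
at_10 = biallelic_fraction(observations, min_reads=10)
print(at_3, at_10)
```

With these made-up counts the bi-allelic share rises as the threshold increases, because low-coverage positions are more likely to capture reads from only one allele by chance.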


Subject(s)
Gene Expression Regulation/genetics , RNA/genetics , Single-Cell Analysis , Transcription, Genetic , Alleles , Exons/genetics , Genomics , Heterozygote , Humans , Introns/genetics , Polymorphism, Single Nucleotide/genetics , RNA-Seq , Software , Exome Sequencing
8.
Int J Nanomedicine ; 13: 199-208, 2018.
Article in English | MEDLINE | ID: mdl-29343958

ABSTRACT

PURPOSE: Anastrozole (ANS) is an aromatase inhibitor that is widely used as a treatment for breast cancer in postmenopausal women. Despite the wide use of ANS, it is associated with serious side effects due to uncontrolled delivery. In addition, ANS exhibits low solubility and short plasma half-life. Nanotechnology-based drug delivery has the potential to enhance the efficacy of drugs and overcome undesirable side effects. In this study, we aimed to prepare novel ANS-loaded PLA-PEG-PLA nanoparticles (ANS-NPs) and to compare the apoptotic response of the MCF-7 cell line to both ANS and ANS-loaded NPs. METHOD: ANS-NPs were synthesized using the double emulsion method and characterized using different methods. The apoptotic response was evaluated by assessing cell viability and morphology, and by studying changes in the expression of the apoptotic genes MAPK3, MCL1, and c-MYC in MCF-7 cell lines. RESULTS: ANS was successfully encapsulated within PLA-PEG-PLA, forming monodisperse therapeutic NPs with an encapsulation efficiency of 67%, particle size of 186±27.13, and a polydispersity index of 0.26±0.11 with a sustained release profile extended over 144 hours. In addition, results for cell viability and for gene expression indicate a similar apoptotic response to the free ANS and ANS-NPs. CONCLUSION: The synthesized ANS-NPs showed a similar therapeutic effect to the free ANS, which provides a rationale to pursue pre-clinical evaluation of ANS-NPs in animal models.


Subject(s)
Apoptosis/drug effects , Aromatase Inhibitors/administration & dosage , Nanoparticles/administration & dosage , Nitriles/administration & dosage , Triazoles/administration & dosage , Anastrozole , Apoptosis/genetics , Aromatase Inhibitors/chemistry , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Cell Survival/drug effects , Drug Delivery Systems/methods , Female , Gene Expression Regulation, Neoplastic , Half-Life , Humans , MCF-7 Cells , Mitogen-Activated Protein Kinase 3/genetics , Myeloid Cell Leukemia Sequence 1 Protein/genetics , Nanoparticles/chemistry , Nitriles/chemistry , Particle Size , Polyesters/chemistry , Polyethylene Glycols/chemistry , Proto-Oncogene Proteins c-myc/genetics , Solubility , Triazoles/chemistry
9.
Sci Rep ; 8(1): 7735, 2018 05 16.
Article in English | MEDLINE | ID: mdl-29769535

ABSTRACT

Imbalanced expression of somatic alleles in cancer can suggest functional and selective features, and can therefore indicate possible driving potential of the underlying genetic variants. To explore the correlation between allele frequency of somatic variants and total gene expression of their harboring gene, we used the unique data set of matched tumor and normal RNA and DNA sequencing data of 5523 distinct single nucleotide variants in 381 individuals across 10 cancer types obtained from The Cancer Genome Atlas (TCGA). We analyzed the allele frequency in the context of the variant and gene functional features and linked it with changes in the total gene expression. We documented higher allele frequency of somatic variants in cancer-implicated genes (Cancer Gene Census, CGC). Furthermore, somatic alleles bearing premature terminating variants (PTVs), when positioned in CGC genes, appeared to be less frequently degraded via nonsense-mediated mRNA decay, indicating possible favoring of truncated proteins by the tumor transcriptome. Among the genes with multiple PTVs with high allele frequency, ARID1, TP53 and NSD1 were known key cancer genes. Altogether, our analyses suggest that high allele frequency of tumor somatic variants can indicate driving functionality and can serve to identify potential cancer-implicated genes.


Subject(s)
Computational Biology/methods , Gene Expression Regulation, Neoplastic , Mutation , Neoplasm Proteins/genetics , Neoplasms/genetics , Polymorphism, Single Nucleotide , Transcriptome , Alleles , Gene Frequency , High-Throughput Nucleotide Sequencing , Humans
10.
Sci Rep ; 7(1): 8287, 2017 08 15.
Article in English | MEDLINE | ID: mdl-28811643

ABSTRACT

Asymmetric allele content in the transcriptome can be indicative of functional and selective features of the underlying genetic variants. Yet, imbalanced alleles, especially from diploid genome regions, are poorly explored in cancer. Here we systematically quantify and integrate the variant allele fraction from corresponding RNA and DNA sequence data from patients with breast cancer acquired through The Cancer Genome Atlas (TCGA). We test for correlation between allele prevalence and functionality in known cancer-implicated genes from the Cancer Gene Census (CGC). We document significant allele-preferential expression of functional variants in CGC genes and across the entire dataset. Notably, we find frequent allele-specific overexpression of variants in tumor-suppressor genes. We also report a list of over-expressed variants from non-CGC genes. Overall, our analysis presents an integrated set of features of somatic allele expression and points to the vast information content of the asymmetric alleles in the cancer transcriptome.


Subject(s)
Alleles , Breast Neoplasms/genetics , Gene Expression Regulation, Neoplastic , Response Elements , Female , Gene Expression Profiling , Genetic Variation , Genotype , Humans , Mutation , Transcriptome