Results 1 - 20 of 41
1.
Genome Biol ; 24(1): 140, 2023 06 19.
Article in English | MEDLINE | ID: mdl-37337297

ABSTRACT

BACKGROUND: In droplet-based single-cell and single-nucleus RNA-seq experiments, not all reads associated with one cell barcode originate from the encapsulated cell. Such background noise is attributed to spillage from cell-free ambient RNA or barcode swapping events. RESULTS: Here, we characterize this background noise exemplified by three scRNA-seq and two snRNA-seq replicates of mouse kidneys. For each experiment, cells from two mouse subspecies are pooled, allowing us to identify cross-genotype contaminating molecules and thus profile background noise. Background noise is highly variable across replicates and cells, making up on average 3-35% of the total counts (UMIs) per cell, and we find that noise levels are directly proportional to the specificity and detectability of marker genes. In search of the source of background noise, we find multiple lines of evidence that the majority of background molecules originates from ambient RNA. Finally, we use our genotype-based estimates to evaluate the performance of three methods (CellBender, DecontX, SoupX) that are designed to quantify and remove background noise. We find that CellBender provides the most precise estimates of background noise levels and also yields the highest improvement for marker gene detection. By contrast, clustering and classification of cells are fairly robust towards background noise and only small improvements can be achieved by background removal that may come at the cost of distortions in fine structure. CONCLUSIONS: Our findings help to better understand the extent, sources and impact of background noise in single-cell experiments and provide guidance on how to deal with it.


Subject(s)
RNA , Single-Cell Analysis , Animals , Mice , Sequence Analysis, RNA/methods , RNA-Seq/methods , RNA/genetics , Genotype , Single-Cell Analysis/methods , Gene Expression Profiling/methods , Cluster Analysis
2.
Elife ; 12, 2023 03 22.
Article in English | MEDLINE | ID: mdl-36947129

ABSTRACT

Brain size and cortical folding have increased and decreased recurrently during mammalian evolution. Identifying genetic elements whose sequence or functional properties co-evolve with these traits can provide unique information on evolutionary and developmental mechanisms. A good candidate for such a comparative approach is TRNP1, as it controls proliferation of neural progenitors in mice and ferrets. Here, we investigate the contribution of both regulatory and coding sequences of TRNP1 to brain size and cortical folding in over 30 mammals. We find that the rate of TRNP1 protein evolution (ω) significantly correlates with brain size, slightly less with cortical folding and much less with body size. This brain correlation is stronger than for >95% of random control proteins. This co-evolution is likely affecting TRNP1 activity, as we find that TRNP1 from species with larger brains and more cortical folding induces higher proliferation rates in neural stem cells. Furthermore, we compare the activity of putative cis-regulatory elements (CREs) of TRNP1 in a massively parallel reporter assay and identify one CRE that likely co-evolves with cortical folding in Old World monkeys and apes. Our analyses indicate that coding and regulatory changes that increased TRNP1 activity were positively selected either as a cause or a consequence of increases in brain size and cortical folding. They also provide an example of how phylogenetic approaches can inform biological mechanisms, especially when combined with molecular phenotypes across several species.


Subject(s)
Ferrets , Neural Stem Cells , Animals , Mice , Brain/metabolism , Cell Cycle Proteins/metabolism , DNA-Binding Proteins/metabolism , Neural Stem Cells/metabolism , Organ Size , Phylogeny
3.
Genome Biol ; 23(1): 88, 2022 03 31.
Article in English | MEDLINE | ID: mdl-35361256

ABSTRACT

Cost-efficient library generation by early barcoding has been central in propelling single-cell RNA sequencing. Here, we optimize and validate prime-seq, an early barcoding bulk RNA-seq method. We show that it performs equivalently to TruSeq, a standard bulk RNA-seq method, but is fourfold more cost-efficient due to almost 50-fold cheaper library costs. We also validate a direct RNA isolation step, show that intronic reads are derived from RNA, and compare cost-efficiencies of available protocols. We conclude that prime-seq is currently one of the best options to set up an early barcoding bulk RNA-seq protocol from which many labs would profit.


Subject(s)
RNA , Base Sequence , Gene Library , RNA/genetics , Sequence Analysis, RNA/methods , Exome Sequencing
4.
J Hematol Oncol ; 15(1): 25, 2022 03 12.
Article in English | MEDLINE | ID: mdl-35279202

ABSTRACT

Acute myeloid leukemia (AML) patients suffer a dismal prognosis upon treatment resistance. To study functional heterogeneity of resistance, we generated serially transplantable patient-derived xenograft (PDX) models from one patient with AML and twelve clones thereof, each derived from a single stem cell, as proven by genetic barcoding. Transcriptome and exome sequencing segregated clones according to their origin from relapse one or two. Undetectable by sequencing, multiplex fluorochrome-guided competitive in vivo treatment trials identified a subset of relapse two clones as uniquely resistant to cytarabine treatment. Transcriptional and proteomic profiles obtained from resistant PDX clones and refractory AML patients defined a 16-gene score that was predictive of clinical outcome in a large independent patient cohort. Thus, we identified novel genes related to cytarabine resistance and provide proof of concept that intra-tumor heterogeneity reflects inter-tumor heterogeneity in AML.


Subject(s)
Leukemia, Myeloid, Acute , Proteomics , Clone Cells , Cytarabine/therapeutic use , Drug Resistance, Neoplasm/genetics , Humans , Leukemia, Myeloid, Acute/drug therapy , Leukemia, Myeloid, Acute/genetics , Leukemia, Myeloid, Acute/pathology , Recurrence , Stem Cells/pathology
5.
PLoS Genet ; 17(5): e1009587, 2021 05.
Article in English | MEDLINE | ID: mdl-34033652

ABSTRACT

Human pluripotent stem cells (PSCs) express human endogenous retrovirus type-H (HERV-H), which exists as more than a thousand copies in the human genome and frequently produces chimeric transcripts as long non-coding RNAs (lncRNAs) fused with downstream neighbor genes. Previous studies showed that HERV-H expression is required for the maintenance of PSC identity, and aberrant HERV-H expression attenuates neural differentiation potential; however, little is known about the actual function of HERV-H. In this study, we focused on ESRG, which is known as a PSC-related HERV-H-driven lncRNA. The global transcriptome data of various tissues and cell lines and quantitative expression analysis of PSCs showed that ESRG expression is much higher than that of other HERV-Hs and is tightly silenced after differentiation. However, the loss of function by the complete excision of the entire ESRG gene body using a CRISPR/Cas9 platform revealed that ESRG is dispensable for the maintenance of the primed and naïve pluripotent states. The loss of ESRG hardly affected the global gene expression of PSCs or the differentiation potential toward trilineage. Differentiated cells derived from ESRG-deficient PSCs retained the potential to be reprogrammed into induced PSCs (iPSCs) by the forced expression of OCT3/4, SOX2, and KLF4. In conclusion, ESRG is dispensable for the maintenance and recapturing of human pluripotency.


Subject(s)
Pluripotent Stem Cells/metabolism , RNA, Long Noncoding/genetics , Cell Differentiation/genetics , Cells, Cultured , Cellular Reprogramming , Female , Gene Silencing , Humans , Kruppel-Like Factor 4 , Neural Stem Cells/cytology , Neural Stem Cells/metabolism , Pluripotent Stem Cells/cytology
6.
Cell Syst ; 12(3): 248-262.e7, 2021 03 17.
Article in English | MEDLINE | ID: mdl-33592194

ABSTRACT

Aggressive brain tumors like glioblastoma depend on support by their local environment and subsets of tumor parenchymal cells may promote specific phases of disease progression. We investigated the glioblastoma microenvironment with transgenic lineage-tracing models, intravital imaging, single-cell transcriptomics, immunofluorescence analysis as well as histopathology and characterized a previously unacknowledged population of tumor-associated cells with a myeloid-like expression profile (TAMEP) that transiently appeared during glioblastoma growth. TAMEP of mice and humans were identified with specific markers. Notably, TAMEP did not derive from microglia or peripheral monocytes but were generated by a fraction of CNS-resident, SOX2-positive progenitors. Abrogation of this progenitor cell population, by conditional Sox2-knockout, drastically reduced glioblastoma vascularization and size. Hence, TAMEP emerge as a tumor parenchymal component with a strong impact on glioblastoma progression.


Subject(s)
Brain Neoplasms/blood supply , Brain Neoplasms/pathology , Glioblastoma/blood supply , Glioblastoma/pathology , Myeloid Cells/pathology , Animals , Brain Neoplasms/drug therapy , Cell Line, Tumor , Disease Progression , Humans , Male , Mice , Parenchymal Tissue/blood supply , Parenchymal Tissue/pathology
7.
Nat Commun ; 10(1): 4667, 2019 10 11.
Article in English | MEDLINE | ID: mdl-31604912

ABSTRACT

The recent rapid spread of single cell RNA sequencing (scRNA-seq) methods has created a large variety of experimental and computational pipelines for which best practices have not yet been established. Here, we use simulations based on five scRNA-seq library protocols in combination with nine realistic differential expression (DE) setups to systematically evaluate three mapping, four imputation, seven normalisation and four differential expression testing approaches resulting in ~3000 pipelines, allowing us to also assess interactions among pipeline steps. We find that choices of normalisation and library preparation protocols have the biggest impact on scRNA-seq analyses. Specifically, we find that library preparation determines the ability to detect symmetric expression differences, while normalisation dominates pipeline performance in asymmetric DE-setups. Finally, we illustrate the importance of informed choices by showing that a good scRNA-seq pipeline can have the same impact on detecting a biological signal as quadrupling the sample size.


Subject(s)
RNA-Seq/standards , Single-Cell Analysis , Animals , Chromosome Mapping , Computer Simulation , Electronic Data Processing/methods , Mice
8.
Nat Commun ; 9(1): 2937, 2018 07 26.
Article in English | MEDLINE | ID: mdl-30050112

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) has emerged as a central genome-wide method to characterize cellular identities and processes. Consequently, improving its sensitivity, flexibility, and cost-efficiency can advance many research questions. Among the flexible plate-based methods, single-cell RNA barcoding and sequencing (SCRB-seq) is highly sensitive and efficient. Here, we systematically evaluate experimental conditions of this protocol and find that adding polyethylene glycol considerably increases sensitivity by enhancing cDNA synthesis. Furthermore, using Terra polymerase increases efficiency due to a more even cDNA amplification that requires less sequencing of libraries. We combined these and other improvements to develop a scRNA-seq library protocol we call molecular crowding SCRB-seq (mcSCRB-seq), which we show to be one of the most sensitive, efficient, and flexible scRNA-seq methods to date.


Subject(s)
RNA/genetics , Sequence Analysis, RNA/methods , Base Sequence , High-Throughput Nucleotide Sequencing/methods , Single-Cell Analysis , Software
9.
Gigascience ; 7(6), 2018 06 01.
Article in English | MEDLINE | ID: mdl-29846586

ABSTRACT

Background: Single-cell RNA-sequencing (scRNA-seq) experiments typically analyze hundreds or thousands of cells after amplification of the cDNA. The high throughput is made possible by the early introduction of sample-specific bar codes (BCs), and the amplification bias is alleviated by unique molecular identifiers (UMIs). Thus, the ideal analysis pipeline for scRNA-seq data needs to efficiently tabulate reads according to both BC and UMI. Findings: zUMIs is a pipeline that can handle both known and random BCs and also efficiently collapse UMIs, either just for exon mapping reads or for both exon and intron mapping reads. If BC annotation is missing, zUMIs can accurately detect intact cells from the distribution of sequencing reads. Another unique feature of zUMIs is the adaptive downsampling function that facilitates dealing with hugely varying library sizes but also allows the user to evaluate whether the library has been sequenced to saturation. To illustrate the utility of zUMIs, we analyzed a single-nucleus RNA-seq dataset and show that more than 35% of all reads map to introns. Also, we show that these intronic reads are informative about expression levels, significantly increasing the number of detected genes and improving the cluster resolution. Conclusions: zUMIs' flexibility makes it possible to accommodate data generated with any of the major scRNA-seq protocols that use BCs and UMIs, and it is the most feature-rich, fast, and user-friendly pipeline to process such scRNA-seq data.
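
The idea of tabulating reads by BC and collapsing UMIs can be illustrated with a minimal Python sketch. It is not the zUMIs implementation: the `(barcode, umi, gene)` record layout, the function name `count_umis`, and the toy reads are all assumptions made for illustration, and UMIs are collapsed by exact sequence match only, without any correction for sequencing errors.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell barcode, gene).

    `reads` is an iterable of (barcode, umi, gene) tuples. This record
    layout is a hypothetical one chosen for the example, not a zUMIs format.
    """
    umis_seen = defaultdict(set)
    for barcode, umi, gene in reads:
        # Reads sharing barcode, gene and UMI are treated as copies of one
        # original molecule, so the set keeps a single entry for them.
        umis_seen[(barcode, gene)].add(umi)
    return {key: len(umis) for key, umis in umis_seen.items()}


# Toy input: four reads, two of which are amplification copies of one molecule.
reads = [
    ("AAAC", "GGT", "Actb"),
    ("AAAC", "GGT", "Actb"),  # same barcode, gene and UMI as the read above
    ("AAAC", "TTA", "Actb"),  # same cell and gene, different molecule
    ("CCGT", "GGT", "Actb"),  # same UMI sequence, but a different cell
]
counts = count_umis(reads)
```

Here the first cell ends up with two molecules of the gene and the second cell with one, although four reads were sequenced.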


Subject(s)
Sequence Analysis, RNA/methods , Software , Gene Expression Regulation , HEK293 Cells , Humans , Introns/genetics
10.
Brief Funct Genomics ; 17(4): 220-232, 2018 07 01.
Article in English | MEDLINE | ID: mdl-29579145

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is currently transforming our understanding of biology, as it is a powerful tool to resolve cellular heterogeneity and molecular networks. Over 50 protocols have been developed in recent years, and data processing and analysis tools are also evolving fast. Here, we review the basic principles underlying the different experimental protocols and how to benchmark them. We also review and compare the essential methods to process scRNA-seq data from mapping, filtering, normalization and batch corrections to basic differential expression analysis. We hope that this helps to choose appropriate experimental and computational methods for the research question at hand.


Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Cell Separation , DNA, Complementary/biosynthesis
11.
Clin Cancer Res ; 24(7): 1716-1726, 2018 04 01.
Article in English | MEDLINE | ID: mdl-29330206

ABSTRACT

Purpose: To study mechanisms of therapy resistance and disease progression, we analyzed the evolution of cytogenetically normal acute myeloid leukemia (CN-AML) based on somatic alterations. Experimental Design: We performed exome sequencing of matched diagnosis, remission, and relapse samples from 50 CN-AML patients treated with intensive chemotherapy. Mutation patterns were correlated with clinical parameters. Results: Evolutionary patterns correlated with clinical outcome. Gain of mutations was associated with late relapse. Alterations of epigenetic regulators were frequently gained at relapse with recurring alterations of KDM6A constituting a mechanism of cytarabine resistance. Low KDM6A expression correlated with adverse clinical outcome, particularly in male patients. At complete remission, persistent mutations representing preleukemic lesions were observed in 48% of patients. The persistence of DNMT3A mutations correlated with shorter time to relapse. Conclusions: Chemotherapy resistance might be acquired through gain of mutations. Insights into the evolution during therapy and disease progression lay the foundation for tailored approaches to treat or prevent relapse of CN-AML. Clin Cancer Res; 24(7); 1716-26. ©2018 AACR.


Subject(s)
Exome/genetics , Leukemia, Myeloid, Acute/genetics , Adult , Aged , Aged, 80 and over , Cell Line , Cytarabine/pharmacology , Cytogenetics/methods , DNA (Cytosine-5-)-Methyltransferases/genetics , Drug Resistance/drug effects , Drug Resistance/genetics , Epigenesis, Genetic/drug effects , Epigenesis, Genetic/genetics , Female , Histone Demethylases/genetics , Humans , Leukemia, Myeloid, Acute/drug therapy , Male , Middle Aged , Mutation/drug effects , Mutation/genetics , Recurrence , Remission Induction/methods , Exome Sequencing/methods , Young Adult
12.
Bioinformatics ; 33(21): 3486-3488, 2017 Nov 01.
Article in English | MEDLINE | ID: mdl-29036287

ABSTRACT

SUMMARY: Power analysis is essential to optimize the design of RNA-seq experiments and to assess and compare the power to detect differentially expressed genes in RNA-seq data. PowsimR is a flexible tool to simulate and evaluate differential expression from bulk and especially single-cell RNA-seq data making it suitable for a priori and posterior power analyses. AVAILABILITY AND IMPLEMENTATION: The R package and associated tutorial are freely available at https://github.com/bvieth/powsimR. CONTACT: vieth@bio.lmu.de or hellmann@bio.lmu.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Software , Single-Cell Analysis
13.
Epigenetics Chromatin ; 10(1): 39, 2017 08 07.
Article in English | MEDLINE | ID: mdl-28784182

ABSTRACT

BACKGROUND: The association of active transcription regulatory elements (TREs) with DNAse I hypersensitivity (DHS[+]) and an 'open' local chromatin configuration has long been known. However, the 3D topography of TREs within the nuclear landscape of individual cells in relation to their active or inactive status has remained elusive. Here, we explored the 3D nuclear topography of active and inactive TREs in the context of a recently proposed model for a functionally defined nuclear architecture, where an active and an inactive nuclear compartment (ANC-INC) form two spatially co-aligned and functionally interacting networks. RESULTS: Using 3D structured illumination microscopy, we performed 3D FISH with differently labeled DNA probe sets targeting either sites with DHS[+], apparently active TREs, or DHS[-] sites harboring inactive TREs. Using an in-house image analysis tool, DNA targets were quantitatively mapped on chromatin compaction shaped 3D nuclear landscapes. Our analyses present evidence for a radial 3D organization of chromatin domain clusters (CDCs) with layers of increasing chromatin compaction from the periphery to the CDC core. Segments harboring active TREs are significantly enriched at the decondensed periphery of CDCs with loops penetrating into interchromatin compartment channels, constituting the ANC. In contrast, segments lacking active TREs (DHS[-]) are enriched toward the compacted interior of CDCs (INC). CONCLUSIONS: Our results add further evidence in support of the ANC-INC network model. The different 3D topographies of DHS[+] and DHS[-] sites suggest positional changes of TREs between the ANC and INC depending on their functional state, which might provide additional protection against an inappropriate activation. 
Our finding of a structural organization of CDCs based on radially arranged layers of different chromatin compaction levels indicates a complex higher-order chromatin organization beyond a dichotomic classification of chromatin into an 'open,' active and 'closed,' inactive state.


Subject(s)
Chromatin/ultrastructure , Regulatory Sequences, Nucleic Acid , Transcriptional Activation , Cell Line, Tumor , Cell Nucleus/metabolism , Cell Nucleus/ultrastructure , Chromatin/genetics , Chromatin/metabolism , Gene Regulatory Networks , Humans , In Situ Hybridization, Fluorescence/methods , Single Molecule Imaging/methods
14.
Nucleus ; 8(5): 548-562, 2017 09 03.
Article in English | MEDLINE | ID: mdl-28524723

ABSTRACT

One of the major functions of DNA methylation is the repression of transposable elements, such as the long-interspersed nuclear element 1 (L1). The underlying mechanism(s), however, are unclear. Here, we addressed how retrotransposon activation and mobilization are regulated by methyl-cytosine modifying ten-eleven-translocation (Tet) proteins and how this is modulated by methyl-CpG binding domain (MBD) proteins. We show that Tet1 activates both endogenous and engineered L1 retrotransposons. Furthermore, we found that Mecp2 and Mbd2 repress Tet1-mediated activation of L1 by preventing 5hmC formation at the L1 promoter. Finally, we demonstrate that the methyl-CpG binding domain, as well as the adjacent non-sequence specific DNA binding domain of Mecp2, are each sufficient to mediate repression of Tet1-induced L1 mobilization. Our study reveals a mechanism by which L1 elements get activated in the absence of Mecp2 and suggests that Tet1 may contribute to Mecp2/Mbd2-deficiency phenotypes, such as Rett syndrome. We propose that the balance between methylation "reader" and "eraser/writer" controls L1 retrotransposition.


Subject(s)
DNA Transposable Elements/genetics , Methyl-CpG-Binding Protein 2/metabolism , Mixed Function Oxygenases/metabolism , Proto-Oncogene Proteins/metabolism , Animals , Cell Line , DNA-Binding Proteins/metabolism , Humans , Mice
15.
Mol Cell ; 65(4): 631-643.e4, 2017 Feb 16.
Article in English | MEDLINE | ID: mdl-28212749

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) offers new possibilities to address biological and medical questions. However, systematic comparisons of the performance of diverse scRNA-seq protocols are lacking. We generated data from 583 mouse embryonic stem cells to evaluate six prominent scRNA-seq methods: CEL-seq2, Drop-seq, MARS-seq, SCRB-seq, Smart-seq, and Smart-seq2. While Smart-seq2 detected the most genes per cell and across cells, CEL-seq2, Drop-seq, MARS-seq, and SCRB-seq quantified mRNA levels with less amplification noise due to the use of unique molecular identifiers (UMIs). Power simulations at different sequencing depths showed that Drop-seq is more cost-efficient for transcriptome quantification of large numbers of cells, while MARS-seq, SCRB-seq, and Smart-seq2 are more efficient when analyzing fewer cells. Our quantitative comparison offers the basis for an informed choice among six prominent scRNA-seq methods, and it provides a framework for benchmarking further improvements of scRNA-seq protocols.


Subject(s)
Embryonic Stem Cells/chemistry , High-Throughput Nucleotide Sequencing , RNA/genetics , Sequence Analysis, RNA/methods , Single-Cell Analysis/methods , Animals , Base Sequence , Cell Line , Computer Simulation , Cost-Benefit Analysis , High-Throughput Nucleotide Sequencing/economics , Mice , Models, Economic , RNA/isolation & purification , Sequence Analysis, RNA/economics , Single-Cell Analysis/economics
16.
Nucleic Acids Res ; 45(5): 2438-2457, 2017 03 17.
Article in English | MEDLINE | ID: mdl-27923996

ABSTRACT

Aberrant DNA methylation is a hallmark of various human disorders, indicating that the spatial and temporal regulation of methylation readers and modifiers is imperative for development and differentiation. In particular, the cross-regulation between 5-methylcytosine binders (MBD) and modifiers (Tet) has not been investigated. Here, we show that binding of Mecp2 and Mbd2 to DNA protects 5-methylcytosine from Tet1-mediated oxidation. The mechanism is not based on competition for 5-methylcytosine binding but on Mecp2 and Mbd2 directly restricting Tet1 access to DNA. We demonstrate that the efficiency of this process depends on the number of bound MBDs per DNA molecule. Accordingly, we find 5-hydroxymethylcytosine enriched at heterochromatin of Mecp2-deficient neurons of a mouse model for Rett syndrome and Tet1-induced reexpression of silenced major satellite repeats. These data unveil fundamental regulatory mechanisms of Tet enzymes and their potential pathophysiological role in Rett syndrome. Importantly, they suggest that Mecp2 and Mbd2 have an essential physiological role as guardians of the epigenome.


Subject(s)
5-Methylcytosine/metabolism , DNA-Binding Proteins/metabolism , DNA/metabolism , Methyl-CpG-Binding Protein 2/metabolism , Proto-Oncogene Proteins/metabolism , 5-Methylcytosine/analogs & derivatives , Animals , Cells, Cultured , DNA/chemistry , DNA, Satellite/metabolism , DNA-Binding Proteins/antagonists & inhibitors , Humans , Male , Methyl-CpG-Binding Protein 2/genetics , Mice , Mice, Inbred C57BL , Mice, Knockout , Neurons/metabolism , Oxidation-Reduction , Proto-Oncogene Proteins/antagonists & inhibitors , Rats , Rett Syndrome/metabolism , Transcription, Genetic
17.
Bioinformatics ; 32(12): 1895-7, 2016 06 15.
Article in English | MEDLINE | ID: mdl-27153702

ABSTRACT

UNLABELLED: SweepFinder is a widely used program that implements a powerful likelihood-based method for detecting recent positive selection, or selective sweeps. Here, we present SweepFinder2, an extension of SweepFinder with increased sensitivity and robustness to the confounding effects of mutation rate variation and background selection. Moreover, SweepFinder2 has increased flexibility that enables the user to specify test sites, set the distance between test sites and utilize a recombination map. AVAILABILITY AND IMPLEMENTATION: SweepFinder2 is a freely-available (www.personal.psu.edu/mxd60/sf2.html) software package that is written in C and can be run from a Unix command line. CONTACT: mxd60@psu.edu.


Subject(s)
Mutation Rate , Selection, Genetic , Software , Evolution, Molecular , Humans , Likelihood Functions
18.
Sci Rep ; 6: 25533, 2016 05 09.
Article in English | MEDLINE | ID: mdl-27156886

ABSTRACT

Currently, quantitative RNA-seq methods are pushed to work with increasingly small starting amounts of RNA that require amplification. However, it is unclear how much noise or bias amplification introduces and how this affects precision and accuracy of RNA quantification. To assess the effects of amplification, reads that originated from the same RNA molecule (PCR-duplicates) need to be identified. Computationally, read duplicates are defined by their mapping position, which does not distinguish PCR- from natural duplicates and hence it is unclear how to treat duplicated reads. Here, we generate and analyse RNA-seq data sets prepared using three different protocols (Smart-Seq, TruSeq and UMI-seq). We find that a large fraction of computationally identified read duplicates are not PCR duplicates and can be explained by sampling and fragmentation bias. Consequently, the computational removal of duplicates improves neither accuracy nor precision and can actually worsen the power and the False Discovery Rate (FDR) for differential gene expression. Even when duplicates are experimentally identified by unique molecular identifiers (UMIs), power and FDR are only mildly improved. However, the pooling of samples as made possible by the early barcoding of the UMI-protocol leads to an appreciable increase in the power to detect differentially expressed genes.
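
The distinction between computationally defined duplicates (same mapping position) and UMI-defined duplicates can be shown with a small Python sketch. It is an illustration under stated assumptions, not the analysis used in the study: the `(chromosome, start, umi)` read representation, the helper `duplicate_fraction`, and the toy reads are hypothetical.

```python
def duplicate_fraction(keys):
    """Return the fraction of reads that repeat an already-seen key."""
    keys = list(keys)
    if not keys:
        return 0.0
    return 1.0 - len(set(keys)) / len(keys)


# Hypothetical reads as (chromosome, start position, UMI).
reads = [
    ("chr1", 100, "AAG"),
    ("chr1", 100, "AAG"),  # same position and same UMI: a PCR duplicate
    ("chr1", 100, "CTT"),  # same position, other UMI: a natural duplicate
    ("chr1", 250, "AAG"),
]

# Position-only definition: anything sharing a mapping position is a duplicate.
by_position = duplicate_fraction((chrom, start) for chrom, start, _ in reads)

# UMI-aware definition: position and UMI must both match.
by_umi = duplicate_fraction(reads)
```

In this toy case the position-only definition flags half of the reads as duplicates, while the UMI-aware definition flags a quarter, because one of the reads at the shared position stems from a different molecule.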


Subject(s)
Gene Expression Profiling/methods , Sequence Analysis, RNA/methods , Databases, Genetic , Gene Expression Regulation , Gene Library , HCT116 Cells , Humans
19.
Mol Ecol ; 25(1): 142-56, 2016 Jan.
Article in English | MEDLINE | ID: mdl-26290347

ABSTRACT

A composite likelihood ratio test implemented in the program sweepfinder is a commonly used method for scanning a genome for recent selective sweeps. sweepfinder uses information on the spatial pattern (along the chromosome) of the site frequency spectrum around the selected locus. To avoid confounding effects of background selection and variation in the mutation process along the genome, the method is typically applied only to sites that are variable within species. However, the power to detect and localize selective sweeps can be greatly improved if invariable sites are also included in the analysis. In the spirit of a Hudson-Kreitman-Aguadé test, we suggest adding fixed differences relative to an out-group to account for variation in mutation rate, thereby facilitating more robust and powerful analyses. We also develop a method for including background selection, modelled as a local reduction in the effective population size. Using simulations, we show that these advances lead to a gain in power while maintaining robustness to mutation rate variation. Furthermore, the new method also provides more precise localization of the causative mutation than methods using the spatial pattern of segregating sites alone.


Subject(s)
Genetics, Population , Models, Genetic , Mutation Rate , Selection, Genetic , Gene Frequency , Humans
20.
Mol Biol Evol ; 31(11): 3026-39, 2014 Nov.
Article in English | MEDLINE | ID: mdl-25158800

ABSTRACT

Detecting positive selection in species with heterogeneous habitats and complex demography is notoriously difficult and prone to statistical biases. The model plant Arabidopsis thaliana exemplifies this problem: In spite of the large amounts of data, little evidence for classic selective sweeps has been found. Moreover, many aspects of the demography are unclear, which makes it hard to judge whether the few signals are indeed signs of selection, or false positives caused by demographic events. Here, we focus on Swedish A. thaliana and we find that the demography can be approximated as a two-population model. Careful analysis of the data shows that such a two-island model is characterized by a very old split time that significantly predates the last glacial maximum followed by secondary contact with strong migration. We evaluate selection based on this demography and find that this secondary contact model strongly affects the power to detect sweeps. Moreover, it affects the power differently for northern Sweden (more false positives) as compared with southern Sweden (more false negatives). However, even when the demographic history is accounted for, sweep signals in northern Sweden are stronger than in southern Sweden, with little or no positional overlap. Further simulations including the complex demography and selection confirm that this is not compatible with global selection acting on both populations, and thus can be taken as evidence for local selection within subpopulations of Swedish A. thaliana. This study demonstrates the necessity of combining demographic analyses and sweep scans for the detection of selection, particularly when selection acts predominantly locally.


Subject(s)
Arabidopsis/genetics , Models, Genetic , Plant Dispersal/genetics , Selection, Genetic , Arabidopsis/classification , Gene Flow , Genetic Variation , Phylogeography , Sweden