The Sum of Two Halves May Be Different from the Whole - Effects of Splitting Sequencing Samples Across Lanes.
Williams, Eleanor C; Chazarra-Gil, Ruben; Shahsavari, Arash; Mohorianu, Irina.
Affiliation
  • Williams EC; Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK.
  • Chazarra-Gil R; Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK.
  • Shahsavari A; Life Sciences-Transcriptomics and Functional Genomics Lab, Barcelona Supercomputing Center (BSC-CNS), 08034 Barcelona, Spain.
  • Mohorianu I; Wellcome-MRC Cambridge Stem Cell Institute, University of Cambridge, Cambridge CB2 0AW, UK.
Genes (Basel); 13(12), 2022-12-01.
Article in English | MEDLINE | ID: mdl-36553532
ABSTRACT
Advances in high-throughput sequencing (HTS) have enabled the characterisation of biological processes at an unprecedented level of detail; most hypotheses in molecular biology rely on analyses of HTS data. However, achieving increased robustness and reproducibility of results remains a main challenge. Although variability in results may be introduced at various stages, e.g., alignment, summarisation or detection of differential expression, one source of variability has been systematically overlooked: the sequencing design, which propagates through analyses and may introduce an additional layer of technical variation. We illustrate qualitative and quantitative differences arising from splitting samples across lanes in bulk and single-cell sequencing. For bulk mRNAseq data, we focus on differential expression and enrichment analyses; for bulk ChIPseq data, we investigate the effect on peak calling and the peaks' properties. At the single-cell level, we concentrate on identifying cell subpopulations, relying on markers used for assigning cell identities; both smartSeq and 10× data are presented. The observed reduction in the number of unique sequenced fragments limits the level of detail on which the different prediction approaches depend. Furthermore, the sequencing stochasticity adds a weighting bias, together with variable sequencing depths and (yet unexplained) sequencing bias. Consequently, we observe an overall reduction in sequencing complexity and a distortion in the biological signal across technologies, experimental contexts, organisms and tissues.
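
The reduction in unique sequenced fragments mentioned in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: the fragment count, read count, the 1/rank abundance model, the random lane assignment and all function names are assumptions chosen only for illustration. It draws reads from a library with skewed fragment abundances, splits them across two lanes, and compares the number of distinct fragments seen in each lane with the number seen in the pooled sample.

```python
import random


def simulate_reads(n_fragments, n_reads, seed=0):
    """Draw reads from a library of distinct fragments (hypothetical model).

    Fragment i is given weight 1/(i+1), an assumed skewed abundance profile.
    """
    rng = random.Random(seed)
    weights = [1.0 / (i + 1) for i in range(n_fragments)]
    return rng.choices(range(n_fragments), weights=weights, k=n_reads)


def split_across_lanes(reads, n_lanes=2, seed=1):
    """Assign every read to one lane uniformly at random (an assumption)."""
    rng = random.Random(seed)
    lanes = [[] for _ in range(n_lanes)]
    for read in reads:
        lanes[rng.randrange(n_lanes)].append(read)
    return lanes


def unique_fragments(reads):
    """Count distinct fragment identifiers among the reads."""
    return len(set(reads))


# Illustrative sizes only; they are not taken from the paper.
reads = simulate_reads(n_fragments=5000, n_reads=20000)
lanes = split_across_lanes(reads)

whole = unique_fragments(reads)
per_lane = [unique_fragments(lane) for lane in lanes]

print("unique fragments, pooled sample:", whole)
for index, count in enumerate(per_lane, start=1):
    print(f"unique fragments, lane {index}:", count)
```

In this toy model, a fragment drawn only once can appear in only one lane, so each lane analysed on its own sees no more distinct fragments than the pooled sample, and typically fewer. Real sequencing adds further effects, such as lane-specific bias, that the sketch does not model.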

Full text: 1 Collections: 01-international Database: MEDLINE Main subject: High-Throughput Nucleotide Sequencing Type of study: Prognostic_studies / Qualitative_research Language: En Year of publication: 2022 Document type: Article