svaseq: removing batch effects and other unwanted noise from sequencing data.
Leek, Jeffrey T.
Affiliation
  • Leek JT; Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21212, US. jtleek@gmail.com.
Nucleic Acids Res; 42(21), 2014 Dec 01.
Article in En | MEDLINE | ID: mdl-25294822
It is now known that unwanted noise and unmodeled artifacts such as batch effects can dramatically reduce the accuracy of statistical inference in genomic experiments. These sources of noise must be modeled and removed to accurately measure biological variability and to obtain correct statistical inference when performing high-throughput genomic analysis. We introduced surrogate variable analysis (sva) for estimating these artifacts by (i) identifying the part of the genomic data only affected by artifacts and (ii) estimating the artifacts with principal components or singular vectors of the subset of the data matrix. The resulting estimates of artifacts can be used in subsequent analyses as adjustment factors to correct analyses. Here I describe a version of the sva approach specifically created for count data or FPKMs from sequencing experiments based on appropriate data transformation. I also describe the addition of supervised sva (ssva) for using control probes to identify the part of the genomic data only affected by artifacts. I present a comparison between these versions of sva and other methods for batch effect estimation on simulated data, real count-based data and FPKM-based data. These updates are available through the sva Bioconductor package and I have made fully reproducible analysis using these methods available from: https://github.com/jtleek/svaseq.
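The two steps named in the abstract (transform the sequencing data, then estimate artifacts from singular vectors) can be illustrated with a small sketch. This is not the sva package's implementation: it is a simplified illustration in plain Python that assumes a log(count + 1) transform and a single binary variable of interest, and it skips step (i), the identification of the genes affected only by artifacts. All function names (`remove_group_means`, `top_right_singular_vector`, `surrogate_variable`, `pearson`) and the simulated data are hypothetical.

```python
import math
import random


def remove_group_means(row, group):
    """Residuals of one gene after fitting an intercept plus a binary group."""
    means = {}
    for g in set(group):
        vals = [x for x, gi in zip(row, group) if gi == g]
        means[g] = sum(vals) / len(vals)
    return [x - means[gi] for x, gi in zip(row, group)]


def top_right_singular_vector(matrix, iters=200):
    """Power iteration on M^T M; returns a unit vector with one entry per sample."""
    n = len(matrix[0])
    rng = random.Random(0)
    v = [rng.gauss(0, 1) for _ in range(n)]
    for _ in range(iters):
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]  # u = M v
        w = [sum(row[j] * ui for row, ui in zip(matrix, u)) for j in range(n)]  # w = M^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return [0.0] * n
        v = [x / norm for x in w]
    return v


def surrogate_variable(counts, group, constant=1.0):
    """One surrogate variable from a genes x samples count matrix.

    Simplification (assumption): every gene is used with equal weight,
    whereas sva first identifies the genes affected only by artifacts.
    """
    logged = [[math.log(c + constant) for c in row] for row in counts]
    residuals = [remove_group_means(row, group) for row in logged]
    return top_right_singular_vector(residuals)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Simulated example (made-up data): 50 genes, 8 samples, with a hidden batch
# that is balanced within each biological group.
rng = random.Random(42)
group = [0, 0, 0, 0, 1, 1, 1, 1]
batch = [0, 1, 0, 1, 0, 1, 0, 1]
counts = []
for _ in range(50):
    batch_effect = rng.uniform(0.5, 1.5)
    group_effect = rng.uniform(-1.0, 1.0)
    counts.append([
        round(math.exp(5.0 + batch_effect * b + group_effect * g + rng.gauss(0, 0.1)))
        for g, b in zip(group, batch)
    ])

sv = surrogate_variable(counts, group)
```

In this simulation the estimated vector `sv` should track the hidden `batch` labels up to sign and scale, which can be checked with `pearson(sv, batch)`. In a real analysis such an estimate would be included as an adjustment factor in the downstream model, as the abstract describes.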
Subject(s)

Full text: 1 Collection: 01-international Database: MEDLINE Main subject: Software / Artifacts / Gene Expression Profiling / High-Throughput Nucleotide Sequencing Type of study: Prognostic_studies Limits: Animals Language: En Journal: Nucleic Acids Res Year: 2014 Document type: Article
