Improving CLIP-seq data analysis by incorporating transcript information.
BMC Genomics; 21(1): 894, 2020 Dec 17.
Article in English | MEDLINE | ID: mdl-33334306
BACKGROUND: Current peak callers for identifying RNA-binding protein (RBP) binding sites from CLIP-seq data take genomic read profiles into account, but they ignore the underlying transcript information, that is, information about splicing events. So far, no studies have examined this issue closely. RESULTS: Here we show that current peak callers are susceptible to false peak calling near exon borders. We quantify the extent of this problem in publicly available datasets and find it to be substantial. By providing a tool called CLIPcontext for automatic extraction of transcript and genomic context sequences, we further demonstrate that the choice of context affects the performance of RBP binding site prediction tools. Moreover, we show that known motifs of exon-binding RBPs are often enriched in transcript context sites, which should enable the recovery of more authentic binding sites. Finally, we discuss possible strategies for integrating transcript information into future workflows. CONCLUSIONS: Our results demonstrate the importance of incorporating transcript information in CLIP-seq data analysis. Taking advantage of the underlying transcript information should therefore become an integral part of future peak calling and downstream analysis tools.
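The distinction between genomic and transcript context drawn in the abstract can be illustrated with a minimal Python sketch. This is not CLIPcontext's implementation or interface: the function names, the toy sequence, the exon coordinates, and the restriction to plus-strand sites lying within a single exon are all assumptions made for illustration. For a site ending at an exon border, extending along the genome runs into the intron, whereas extending along the spliced transcript continues into the next exon.

```python
def genomic_context(genome, start, end, ext):
    """Extend the site [start, end) by ext nt on each side along the genome."""
    lo = max(0, start - ext)
    hi = min(len(genome), end + ext)
    return genome[lo:hi]


def transcript_context(genome, exons, start, end, ext):
    """Extend the site [start, end) by ext nt on each side along the spliced transcript.

    exons: sorted, non-overlapping half-open genomic intervals (plus strand assumed).
    The site must lie entirely within one exon (a simplifying assumption).
    """
    # Spliced transcript sequence: exons concatenated, introns removed.
    transcript = "".join(genome[s:e] for s, e in exons)
    offset = 0  # transcript coordinate at which the current exon starts
    for s, e in exons:
        if s <= start and end <= e:
            t_start = offset + (start - s)
            t_end = offset + (end - s)
            lo = max(0, t_start - ext)
            hi = min(len(transcript), t_end + ext)
            return transcript[lo:hi]
        offset += e - s
    raise ValueError("site does not lie within a single exon")


# Hypothetical toy locus: two exons (uppercase) separated by an intron (lowercase).
genome = "AAAAACCCCC" + "gtttttttag" + "GGGGGTTTTT"
exons = [(0, 10), (20, 30)]

# Site covering the last 3 nt of exon 1, extended by 4 nt on each side.
print(genomic_context(genome, 7, 10, 4))            # AACCCCCgttt
print(transcript_context(genome, exons, 7, 10, 4))  # AACCCCCGGGG
```

In the toy example the downstream half of the genomic context is intronic sequence, while the transcript context contains the start of the second exon, which is the sequence an RBP bound to the mature transcript would actually encounter.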
Full text:
1
Collection:
01-internacional
Database:
MEDLINE
Main subject:
Data Analysis
/
Chromatin Immunoprecipitation Sequencing
Language:
En
Journal:
BMC Genomics
Journal subject:
GENETICS
Year:
2020
Document type:
Article
Affiliation country:
Germany
Country of publication:
United Kingdom