ABSTRACT
Next-generation sequencing (NGS) platforms allow the analysis of hundreds of millions of molecules in a single sequencing run, revolutionizing many research areas. NGS-based microRNA studies enable expression quantification at an unprecedented scale without the limitations of closed platforms. Yet, although these platforms have produced a massive amount of publicly available data, comparisons of their quantification and discovery capabilities are still lacking. Here we compare two NGS platforms, SOLiD and PGM, by evaluating their microRNA identification and quantification capabilities using two breast-derived cell lines. A high expression correlation (R² > 0.9) was achieved, encompassing 97% of the miRNAs, and the few discrepancies in miRNA counts were attributable to molecules with very low expression. Quantification divergences indicative of artefactual representation were seen for 24 miRNAs: 14 were more abundant in SOLiD reads and 10 were more abundant in PGM data. Inspection of these miRNAs revealed a statistically significant increase in the count of uracils and uracil stretches for PGM-enriched miRNAs, compared with SOLiD and with miRBase. In parallel, adenines and adenine stretches were enriched in SOLiD-derived miRNA reads. We conclude that, although both platforms are consistent overall and can be used interchangeably for microRNA expression studies, particular sequence features appear to be indicative of platform-specific bias, and their presence in microRNAs should be considered in database analyses.
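As an illustration of the sequence features mentioned above, the following minimal Python sketch counts occurrences of a given base and the lengths of its homopolymer stretches in a miRNA sequence. It is not the pipeline used in this study: the function names, the `min_len` threshold of two consecutive bases, and the example sequence are all assumptions made for illustration only.

```python
import re


def base_count(seq: str, base: str) -> int:
    """Count occurrences of a single base (case-insensitive)."""
    return seq.upper().count(base.upper())


def stretch_lengths(seq: str, base: str, min_len: int = 2) -> list:
    """Return the lengths of all runs of `base` that are at least
    `min_len` long. The min_len default of 2 is an assumption; the
    abstract does not define a stretch length."""
    pattern = re.compile(f"{re.escape(base.upper())}{{{min_len},}}")
    return [len(m.group()) for m in pattern.finditer(seq.upper())]


# Hypothetical RNA sequence, not a real miRBase entry.
example = "UUAGGCUUUUACAAU"

u_total = base_count(example, "U")
u_runs = stretch_lengths(example, "U")
a_total = base_count(example, "A")
a_runs = stretch_lengths(example, "A")
```

Applying such counts to the platform-enriched miRNAs and to a reference set (for example, all miRBase sequences) would give the per-group distributions that a statistical test could then compare.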