Systematic assessment of long-read RNA-seq methods for transcript identification and quantification.

Pardo-Palacios, Francisco J; Wang, Dingjie; Reese, Fairlie; Diekhans, Mark; Carbonell-Sala, Sílvia; Williams, Brian; Loveland, Jane E; De María, Maite; Adams, Matthew S; Balderrama-Gutierrez, Gabriela; Behera, Amit K; Gonzalez, Jose M; Hunt, Toby; Lagarde, Julien; Liang, Cindy E; Li, Haoran; Jerryd Meade, Marcus; Moraga Amador, David A; Prjibelski, Andrey D; Birol, Inanc; Bostan, Hamed; Brooks, Ashley M; Hasan Çelik, Muhammed; Chen, Ying; Du, Mei R M; Felton, Colette; Göke, Jonathan; Hafezqorani, Saber; Herwig, Ralf; Kawaji, Hideya; Lee, Joseph; Liang Li, Jian; Lienhard, Matthias; Mikheenko, Alla; Mulligan, Dennis; Ming Nip, Ka; Pertea, Mihaela; Ritchie, Matthew E; Sim, Andre D; Tang, Alison D; Kei Wan, Yuk; Wang, Changqing; Wong, Brandon Y; Yang, Chen; Barnes, If; Berry, Andrew; Capella, Salvador; Dhillon, Namrita; Fernandez-Gonzalez, Jose M; Ferrández-Peral, Luis.

bioRxiv ; 2023 Jul 27.

Article in English | MEDLINE | ID: mdl-37546854

ABSTRACT

The Long-read RNA-Seq Genome Annotation Assessment Project (LRGASP) Consortium was formed to evaluate the effectiveness of long-read approaches for transcriptome analysis. The consortium generated over 427 million long-read sequences from cDNA and direct RNA datasets, encompassing human, mouse, and manatee species, using different protocols and sequencing platforms. These data were utilized by developers to address challenges in transcript isoform detection and quantification, as well as de novo transcript isoform identification. The study revealed that libraries with longer, more accurate sequences produce more accurate transcripts than those with increased read depth, whereas greater read depth improved quantification accuracy. In well-annotated genomes, tools based on reference sequences demonstrated the best performance. When aiming to detect rare and novel transcripts or when using reference-free approaches, incorporating additional orthogonal data and replicate samples is advised. This collaborative study offers a benchmark for current practices and provides direction for future method development in transcriptome analysis.

A comparison framework and guideline of clustering methods for mass cytometry data.

Liu, Xiao; Song, Weichen; Wong, Brandon Y; Zhang, Ting; Yu, Shunying; Lin, Guan Ning; Ding, Xianting.

Genome Biol ; 20(1): 297, 2019 12 23.

Article in English | MEDLINE | ID: mdl-31870419

ABSTRACT

BACKGROUND: With the expanding applications of mass cytometry in medical research, a wide variety of clustering methods, both semi-supervised and unsupervised, have been developed for data analysis. Selecting the optimal clustering method can accelerate the identification of meaningful cell populations. RESULT: To address this issue, we compared three classes of performance measures, "precision" as external evaluation, "coherence" as internal evaluation, and stability, of nine methods based on six independent benchmark datasets. Seven unsupervised methods (Accense, Xshift, PhenoGraph, FlowSOM, flowMeans, DEPECHE, and kmeans) and two semi-supervised methods (Automated Cell-type Discovery and Classification and linear discriminant analysis (LDA)) were tested on six mass cytometry datasets. For each method, we computed and compared all defined performance measures under random subsampling, varying sample sizes, and varying numbers of clusters. LDA reproduces the manual labels most precisely but does not rank top in internal evaluation. PhenoGraph and FlowSOM perform better than other unsupervised tools in precision, coherence, and stability. PhenoGraph and Xshift are more robust when detecting refined sub-clusters, whereas DEPECHE and FlowSOM tend to group similar clusters into meta-clusters. The performances of PhenoGraph, Xshift, and flowMeans are impacted by increased sample size, but FlowSOM is relatively stable as sample size increases. CONCLUSION: Precision, coherence, stability, and clustering resolution should all be considered together when choosing an appropriate tool for cytometry data analysis. We therefore provide decision guidelines based on these characteristics to help the general reader choose the most suitable clustering tool.

Subject(s)

Cell Separation , Software , Cluster Analysis , Discriminant Analysis
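
The evaluation idea in the abstract above (external "precision" against manual labels, plus stability under random subsampling) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the abstract does not name its specific measures, so the adjusted Rand index is used as a stand-in for external evaluation, the spread of that score across subsamples is used as a stand-in for stability, and `kmeans_1d`, `assign_1d`, and `subsample_scores` are hypothetical toy helpers, not any of the benchmarked tools.

```python
import random
from collections import Counter
from math import comb
from statistics import mean, pstdev


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same items."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Count pairs that share a cluster in both, in A only, and in B only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def assign_1d(values, centers):
    """Index of the nearest center for each value."""
    return [min(range(len(centers)), key=lambda k: abs(v - centers[k])) for v in values]


def kmeans_1d(values, centers, n_iter=20):
    """Toy 1-D k-means with fixed starting centers (stand-in for a real tool)."""
    centers = list(centers)
    for _ in range(n_iter):
        labels = assign_1d(values, centers)
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the old center if a cluster is empty
                centers[k] = sum(members) / len(members)
    return assign_1d(values, centers)


def subsample_scores(values, manual_labels, centers, n_rounds=10, fraction=0.5, seed=0):
    """External score on repeated random subsamples; low spread suggests stability."""
    rng = random.Random(seed)
    size = max(2, int(len(values) * fraction))
    scores = []
    for _ in range(n_rounds):
        idx = rng.sample(range(len(values)), size)
        sub_values = [values[i] for i in idx]
        sub_labels = [manual_labels[i] for i in idx]
        scores.append(adjusted_rand_index(sub_labels, kmeans_1d(sub_values, centers)))
    return scores


# Made-up toy data: two well-separated groups with "manual" labels.
values = [0.0, 0.1, 0.2, 0.3, 5.0, 5.1, 5.2, 5.3]
manual = ["A"] * 4 + ["B"] * 4
full_score = adjusted_rand_index(manual, kmeans_1d(values, [0.0, 5.0]))
scores = subsample_scores(values, manual, [0.0, 5.0])
print("external score on all cells:", full_score)
print("subsample mean and spread:", mean(scores), pstdev(scores))
```

An internal "coherence" measure would be computed from the data and cluster assignments alone, without the manual labels; it is omitted here to keep the sketch short.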
