ABSTRACT
Single-molecule real-time isoform sequencing (Iso-seq) of transcriptomes by PacBio generates very long and accurate reads, making it an ideal platform for full-length transcriptome analysis. We present TAGET, an integrated computational toolkit for analyzing Iso-seq full-length transcript data. It covers transcript alignment, annotation, gene fusion detection, and quantification analyses such as differential gene expression analysis and differential isoform usage analysis. We evaluate the performance of TAGET using a public Iso-seq dataset and newly sequenced Iso-seq datasets from tumor patients. TAGET predicts novel splice sites significantly more precisely and enables more accurate discovery of novel isoforms and gene fusions, as confirmed by experimental validation and comparison with RNA-seq data. We identify and experimentally validate ECM1 as a gene with differential isoform usage, and further show that its isoform ECM1b may be a tumor suppressor in laryngocarcinoma. Our results demonstrate that TAGET is a valuable computational toolkit that can be applied to many full-length transcriptome studies.
Subjects
Data Analysis, Gene Expression Profiling, Humans, Gene Fusion, RNA-Seq, Transcriptome/genetics, Extracellular Matrix Proteins
ABSTRACT
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection has caused a worldwide pandemic since late 2019. There have been ~675 million confirmed coronavirus disease 2019 (COVID-19) cases, leading to more than 6.8 million deaths as of March 1, 2023. Five SARS-CoV-2 variants of concern (VOCs) were tracked as they emerged and were subsequently characterized. However, it is still difficult to predict the next dominant variant because of the rapid evolution of its spike (S) glycoprotein, which affects binding to the cellular receptor angiotensin-converting enzyme 2 (ACE2) and blocks recognition of the presented epitopes by monoclonal antibodies (mAbs). Here, we established a robust mammalian cell-surface-display platform to study S-ACE2 and S-mAb interactions on a large scale. A lentivirus library of S variants was generated via in silico chip synthesis followed by site-directed saturation mutagenesis, after which the enriched candidates were acquired through single-cell fluorescence sorting and analyzed by third-generation DNA sequencing technologies. The mutational landscape provides a blueprint for understanding the key S protein residues that govern ACE2 binding affinity and mAb evasion. We found that S205F, Y453F, Q493A, Q493M, Q498H, Q498Y, N501F, and N501T showed a 3- to 12-fold increase in infectivity, of which Y453F, Q493A, and Q498Y exhibited at least 10-fold resistance to the mAbs REGN10933, LY-CoV555, and REGN10987, respectively. These mammalian-cell-based methods may assist in the precise control of SARS-CoV-2 in the future.