ABSTRACT
RNA-binding proteins (RBPs) are central actors in post-transcriptional RNA regulation. Experiments that profile RBP binding sites in vivo are limited to transcripts expressed in the experimental cell type, creating the need for computational methods to infer missing binding information. While numerous machine-learning-based methods have been developed for this task, they use heterogeneous training and evaluation datasets across different sets of RBPs and CLIP-seq protocols, which makes a direct comparison of their performance difficult. Here, we compile a set of 37 machine learning (primarily deep learning) methods for in vivo RBP-RNA interaction prediction and systematically benchmark a subset of 11 representative methods across hundreds of CLIP-seq datasets and RBPs. Using homogenized sample pre-processing and two negative-class sample generation strategies, we evaluate the methods' predictive performance and assess the impact of neural network architectures and input modalities on model performance. We believe that this study will not only enable researchers to choose the optimal prediction method for the task at hand, but also help method developers build novel, high-performing methods by introducing a standardized framework for their evaluation.
Subject(s)
Benchmarking, Chromatin Immunoprecipitation Sequencing, Binding Sites, Machine Learning, RNA/genetics
ABSTRACT
We present RBPNet, a novel deep learning method, which predicts CLIP-seq crosslink count distribution from RNA sequence at single-nucleotide resolution. By training on up to a million regions, RBPNet achieves high generalization on eCLIP, iCLIP and miCLIP assays, outperforming state-of-the-art classifiers. RBPNet performs bias correction by modeling the raw signal as a mixture of the protein-specific and background signal. Through model interrogation via Integrated Gradients, RBPNet identifies predictive sub-sequences that correspond to known and novel binding motifs and enables variant-impact scoring via in silico mutagenesis. Together, RBPNet improves imputation of protein-RNA interactions, as well as mechanistic interpretation of predictions.
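The variant-impact scoring via in silico mutagenesis mentioned above can be illustrated with a minimal sketch. This is not RBPNet's implementation or API: `toy_score` is a hypothetical stand-in for a trained model (it simply counts occurrences of an arbitrary motif), and the RNA alphabet, function names, and example sequence are all assumptions made for illustration. The idea is to substitute every position with each alternative nucleotide and record how much the score changes relative to the reference sequence.

```python
from typing import Callable, Dict, Tuple

NUCLEOTIDES = "ACGU"  # assumed RNA alphabet


def toy_score(seq: str) -> float:
    """Hypothetical stand-in for a trained model: counts an arbitrary motif."""
    motif = "GCAUG"  # illustrative motif only, not taken from the source
    return float(
        sum(seq[i:i + len(motif)] == motif for i in range(len(seq) - len(motif) + 1))
    )


def in_silico_mutagenesis(
    seq: str, score_fn: Callable[[str], float]
) -> Dict[Tuple[int, str], float]:
    """Return score(mutant) - score(reference) for every single-nucleotide substitution."""
    ref_score = score_fn(seq)
    effects: Dict[Tuple[int, str], float] = {}
    for pos, ref_nt in enumerate(seq):
        for alt in NUCLEOTIDES:
            if alt == ref_nt:
                continue  # skip the reference nucleotide itself
            mutant = seq[:pos] + alt + seq[pos + 1:]
            effects[(pos, alt)] = score_fn(mutant) - ref_score
    return effects


if __name__ == "__main__":
    effects = in_silico_mutagenesis("AAGCAUGAA", toy_score)
    # Report the substitutions that lower the toy score the most.
    for (pos, alt), delta in sorted(effects.items(), key=lambda kv: kv[1])[:5]:
        print(f"position {pos} -> {alt}: delta = {delta:+.1f}")
```

With a real model, `score_fn` would wrap the model's prediction for a sequence (for example, a summary of its predicted crosslink signal); which summary to use is a design choice the abstract does not specify.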
Subject(s)
Base Sequence, Computer Simulation, Deep Learning, RNA-Binding Proteins, RNA, Humans, Alleles, Bias, Binding Sites, Consensus Sequence, Datasets as Topic, Internet, Mutation, Nucleotide Motifs, Nucleotides/metabolism, RNA/chemistry, RNA/genetics, RNA/metabolism, RNA Splice Sites, Messenger RNA/chemistry, Messenger RNA/genetics, Messenger RNA/metabolism, Viral RNA/chemistry, Viral RNA/genetics, Viral RNA/metabolism, RNA-Binding Proteins/chemistry, RNA-Binding Proteins/metabolism
ABSTRACT
While glioblastoma (GBM) is still challenging to treat, novel immunotherapeutic approaches have shown promising effects in preclinical settings. However, their clinical breakthrough is hampered by complex interactions of GBM with the tumor microenvironment (TME). Here, we present an analysis of TME composition in a patient-derived organoid (PDO) model as well as in organotypic slice cultures (OSCs). To obtain a more realistic model for immunotherapeutic testing, we introduce an enhanced PDO model. We manufactured PDOs and OSCs from fresh tissue of GBM patients and analyzed the TME. Enhanced PDOs (ePDOs) were obtained via co-culture with peripheral blood mononuclear cells (PBMCs) and compared to normal PDOs (nPDOs) and primary tissue (PT). First, we showed that the TME was not sustained in PDOs after a short time in culture. In contrast, the TME was largely maintained in OSCs. However, OSCs can only be cultured for up to 9 days. We therefore enhanced the TME in PDOs by co-culturing them with PBMCs from healthy donors. These cellular TME patterns could be preserved until day 21. The ePDO approach could mirror the interaction of GBM, TME and immunotherapeutic agents and may consequently represent a realistic model for individual immunotherapeutic drug testing in the future.