ABSTRACT
BACKGROUND: It has recently been shown that significant and accurate single nucleotide variants (SNVs) can be reliably called from RNA-Seq data. These may provide another source of features for multivariate predictive modeling of disease phenotype for the prioritization of candidate biomarkers. The continuous nature of SNV allele fraction features allows the concurrent investigation of several genomic phenomena, including allele-specific expression, clonal expansion and/or deletion, and copy number variation. RESULTS: The proposed software pipeline and package, SNV Discriminant Analysis (SNV-DA), was applied to two RNA-Seq datasets with varying sample sizes sequenced at different depths: a dataset containing primary tumors from twenty patients with different disease outcomes in lung adenocarcinoma, and a larger dataset of primary tumors representing two major breast cancer subtypes, estrogen receptor positive and triple negative. Predictive models were generated using the machine learning algorithm sparse projections to latent structures discriminant analysis. Training sets composed of RNA-Seq SNV features limited to genomic regions of origin (e.g. exonic or intronic) and/or RNA-editing sites were shown to produce models that had accurate predictive performance, were discriminant towards true label groupings, and produced SNV rankings significantly different from those of univariate tests. Furthermore, the utility of the proposed methodology is supported by its comparable performance to traditional models as well as the enrichment of selected SNVs located in genes previously associated with cancer and genes showing allele-specific expression. As proof of concept, we highlight the discovery of a previously unannotated intergenic locus that is associated with epigenetic regulatory marks in cancer and whose significant allele-specific expression is correlated with ER+ status; hereafter named ER+ associated hotspot (ERPAHS).
CONCLUSION: The use of models from RNA-Seq SNVs to identify and prioritize candidate molecular targets for biomarker discovery is supported by the ability of the proposed method to produce significantly accurate predictive models that are discriminant towards true label groupings. Importantly, the proposed methodology allows investigation of mutations outside of exonic regions and identification of interesting expressed loci not included in traditional gene annotations. An implementation of the proposed methodology is provided that allows the user to specify SNV filtering criteria and cross-validation design during model creation and evaluation.
Subjects
Genetic Models, Single Nucleotide Polymorphism, RNA Sequence Analysis, Software, 3' Untranslated Regions, Adenocarcinoma/genetics, Adenocarcinoma of Lung, Algorithms, Tumor Biomarkers/genetics, Breast Neoplasms/genetics, Non-Small-Cell Lung Carcinoma/genetics, Intergenic DNA/genetics, Discriminant Analysis, Exons, Female, Humans, Introns, Lung Neoplasms/genetics, RNA Editing
ABSTRACT
Currently, the majority of animal models that are used to study biofilm-related infections use planktonic bacterial cells as initial inocula to produce positive signals of infection in biomaterials studies. However, the use of planktonic cells has potentially led to inconsistent results in infection outcomes. In this study, well-established biofilms of methicillin-resistant Staphylococcus aureus were grown and used as initial inocula in an animal model of a Type IIIB open fracture. The goal of the work was to establish, for the first time, a repeatable model of biofilm implant-related osteomyelitis, wherein biofilms were used as initial inocula to test combination biomaterials. Results showed that 100% of animals that were treated with biofilms developed osteomyelitis, whereas 0% of animals not treated with biofilm developed infection. The development of this experimental model may lead to an important shift in biofilm and biomaterials research by showing that when biofilms are used as initial inocula, they may provide additional insights into how biofilm-related infections in the clinic develop and how they can be treated with combination biomaterials to eradicate and/or prevent biofilm formation.