Evaluating the Predictability of Cancer Types from 536 Somatic Mutations: A New Dataset.
Annu Int Conf IEEE Eng Med Biol Soc ; 2020: 5308-5311, 2020 07.
Article in En | MEDLINE | ID: mdl-33019182
ABSTRACT
In this paper, we introduce a new dataset for cancer research containing the somatic mutation states of 536 genes from the Cancer Gene Census (CGC). We used somatic mutation information from The Cancer Genome Atlas (TCGA) projects to create this dataset. As a preliminary investigation, we employed machine learning techniques, including k-Nearest Neighbors, Decision Tree, Random Forest, and Artificial Neural Networks (ANNs), to evaluate the potential of these somatic mutations for classifying cancer types. We compared our models on accuracy, precision, recall, and F1-score. ANNs outperformed the other models, with an F1-score of 0.36, an overall classification accuracy of 40%, and precision ranging from 12% to 92% across cancer types. The 40% accuracy is significantly higher than random guessing, which would have resulted in about 3% overall classification accuracy (1 in 33 cancer types). Although the model has relatively low overall accuracy, it has an average classification specificity of 98%. The ANN achieved high precision scores (> 0.7) for 5 of the 33 cancer types. The introduced dataset can be used for research on TCGA data, such as survival analysis, histopathology image analysis, and content-based image retrieval. The dataset is available online for download at https://kimialab.uwaterloo.ca/kimia/.
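The relationship between the reported figures (about 3% chance accuracy, 40% accuracy, 98% average specificity) can be illustrated with a minimal sketch. The confusion matrix below is synthetic and purely an assumption for illustration: 33 balanced classes of 160 samples each, 40% of each class predicted correctly, and the errors spread evenly over the other 32 classes. It is not the paper's data, and the function names are hypothetical.

```python
def chance_accuracy(num_classes):
    """Expected accuracy of uniform random guessing on balanced classes."""
    return 1.0 / num_classes


def overall_accuracy(confusion):
    """confusion[i][j] = count of samples with true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


def per_class_metrics(confusion):
    """One-vs-rest precision, recall, specificity, and F1 for every class."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        results.append({"precision": precision, "recall": recall,
                        "specificity": specificity, "f1": f1})
    return results


# Synthetic, assumed matrix: 64 correct per class, 3 errors to each other class.
NUM_CLASSES = 33
confusion = [[64 if i == j else 3 for j in range(NUM_CLASSES)]
             for i in range(NUM_CLASSES)]

baseline = chance_accuracy(NUM_CLASSES)
accuracy = overall_accuracy(confusion)
metrics = per_class_metrics(confusion)
mean_specificity = sum(m["specificity"] for m in metrics) / NUM_CLASSES

print(f"chance accuracy:  {baseline:.3f}")
print(f"overall accuracy: {accuracy:.3f}")
print(f"mean specificity: {mean_specificity:.3f}")
```

Under these assumptions, each class has 96 false positives against 5024 true negatives, so specificity is high even though most samples are misclassified. With many classes, true negatives dominate the one-vs-rest counts, which is why specificity alone says little about how well a multi-class model separates cancer types.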
Full text: 1 Collection: 01-internacional Database: MEDLINE Main subject: Neural Networks, Computer / Neoplasms Type of study: Diagnostic_studies / Prognostic_studies / Risk_factors_studies Limits: Humans Language: En Journal: Annu Int Conf IEEE Eng Med Biol Soc Year: 2020 Type: Article