ABSTRACT
Recent studies have shown that somatic cancer mutations target genes in specific signaling and cellular pathways. However, in each patient only a few of the pathway genes are mutated. Current approaches consider only existing pathways and ignore pathway topology. For this reason, new efforts have focused on identifying significantly mutated subnetworks and associating them with cancer characteristics. We applied two well-established network analysis approaches to identify significantly mutated subnetworks in the breast cancer genome, taking network topology into account when measuring the mutation similarity of a gene pair. Our goals are to evaluate whether the identified subnetworks can be used as biomarkers for predicting breast cancer patient survival and to propose potential mechanisms for the pathways enriched in the subnetworks, with the aim of improving breast cancer treatment. Using the copy number alteration (CNA) datasets from the METABRIC (Molecular Taxonomy of Breast Cancer International Consortium) study, we identified a significantly mutated, clinically and functionally relevant subnetwork using two graph-based clustering algorithms. The mutational pattern of the subnetwork is significantly associated with breast cancer survival. The genes in the subnetwork are significantly enriched in the KEGG retinol metabolism pathway. Our results show that retinoid treatment may be a potential personalized therapy for breast cancer patients, since a patient's CNA pattern can indicate whether the retinoid pathway is altered. We also showed that applying multiple bioinformatics algorithms together has the potential to identify new network-based biomarkers, which may be useful for stratifying cancer patients when choosing optimal treatments.
Subjects
Tumor Biomarkers/genetics , Breast Neoplasms/classification , Breast Neoplasms/pathology , Neoplastic Gene Expression Regulation , Gene Regulatory Networks , Mutation , Transcriptome , Aged , Algorithms , Breast Neoplasms/genetics , Computational Biology , DNA Copy Number Variations , Female , Humans , Middle Aged , Prognosis , Protein Interaction Maps , Survival Rate
ABSTRACT
Classification of breast cancer subtypes using multi-omics profiles is a difficult problem because the data sets are high-dimensional and highly correlated. Deep neural network (DNN) learning has demonstrated advantages over traditional methods: it does not require hand-crafted features, but instead automatically extracts features from raw data and efficiently analyzes high-dimensional, correlated data. We aim to develop an integrative deep learning framework for classifying molecular subtypes of breast cancer. We collect copy number alteration and gene expression data measured on the same breast cancer patients from the Molecular Taxonomy of Breast Cancer International Consortium. We propose a deep learning model that integrates these omics datasets to predict the patients' molecular subtypes. The performance of the proposed DNN model is compared with that of several baseline models. Furthermore, we evaluate the misclassification of the subtypes using the learned deep features and explore their usefulness for clustering the breast cancer patients. We demonstrate that the proposed integrative deep learning model is superior to other deep learning and non-deep learning models. In particular, among the deep learning-based integration models, we obtain the best prediction result when the two data sources are integrated through a concatenation layer without sharing weights. Using the learned deep features, we identify 6 breast cancer subgroups and show that Her2-enriched samples can be classified into more than one tumor subtype. Overall, the integrated model shows better performance than models trained on individual data sources.
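The integration scheme described above (one branch per data source, each with its own weights, joined by a concatenation layer) can be illustrated with a minimal forward-pass sketch. This is not the authors' model: the layer sizes, the number of classes, the random untrained weights, and the names `dense`, `init`, and `forward` are all assumptions chosen for illustration, and no training step is shown.

```python
import math
import random


def dense(x, weights, bias):
    """Fully connected layer followed by a ReLU activation."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def init(rng, n_out, n_in):
    """Random weight matrix (n_out x n_in) and a zero bias vector."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def forward(cna, expr, params):
    """Encode each omics source separately, concatenate, then classify."""
    h_cna = dense(cna, *params["cna"])     # branch 1: its own weights
    h_expr = dense(expr, *params["expr"])  # branch 2: its own weights
    merged = h_cna + h_expr                # concatenation layer
    w, b = params["out"]
    logits = [sum(wi * hi for wi, hi in zip(row, merged)) + bi
              for row, bi in zip(w, b)]
    return softmax(logits)


# Hypothetical sizes: 6 CNA features, 8 expression features, 5 classes.
rng = random.Random(0)
params = {
    "cna": init(rng, 4, 6),
    "expr": init(rng, 4, 8),
    "out": init(rng, 5, 8),  # 8 = 4 + 4 concatenated hidden units
}
cna_profile = [rng.uniform(-1, 1) for _ in range(6)]
expr_profile = [rng.uniform(-1, 1) for _ in range(8)]
probs = forward(cna_profile, expr_profile, params)
```

Because the two branches hold separate parameter sets, each data source learns its own representation before the concatenated features reach the classifier.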
ABSTRACT
Many cancers have been linked to copy number variations (CNVs) in genomic DNA. Although methods exist for analyzing CNVs in individual samples, cancer-causing genes are more frequently discovered in regions where CNVs are common among tumor samples, known as recurrent CNVs. Integrating multiple samples and locating recurrent CNV regions remains a challenge, both computationally and conceptually. We propose a new graph-based algorithm for identifying recurrent CNVs using maximal clique detection. The algorithm finds an optimal solution, meaning that all maximal cliques are identified; this guarantees that the identified CNV regions are the most frequent and that the minimal regions have been delineated among tumor samples. The algorithm has been successfully applied to a large cohort of breast cancer samples, where it identified breast cancer-associated genes and pathways.
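The idea of finding recurrent CNVs through maximal cliques can be sketched as follows. This is an illustrative reading, not the published algorithm: it assumes each sample contributes one CNV segment on the same chromosome, represented as a half-open `(start, end)` interval, and it uses the basic Bron-Kerbosch enumeration as a stand-in for whatever clique method the authors use. The function names and the example coordinates are hypothetical.

```python
from itertools import combinations


def overlaps(a, b):
    """True when two half-open intervals (start, end) share a position."""
    return a[0] < b[1] and b[0] < a[1]


def build_overlap_graph(segments):
    """One node per segment; an edge joins two segments that overlap."""
    adj = {i: set() for i in range(len(segments))}
    for i, j in combinations(range(len(segments)), 2):
        if overlaps(segments[i], segments[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj


def maximal_cliques(adj):
    """Enumerate all maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return cliques


def recurrent_regions(segments, min_samples=2):
    """Return (start, end, sample_count) for each maximal clique.

    The shared region of a clique is the intersection of its intervals,
    which is non-empty for pairwise-overlapping intervals on a line.
    """
    regions = []
    for clique in maximal_cliques(build_overlap_graph(segments)):
        if len(clique) >= min_samples:
            start = max(segments[i][0] for i in clique)
            end = min(segments[i][1] for i in clique)
            regions.append((start, end, len(clique)))
    return sorted(regions)


# Hypothetical segments from four samples on one chromosome.
segments = [(100, 500), (300, 700), (400, 900), (800, 1000)]
regions = recurrent_regions(segments)
```

Each maximal clique is a set of samples whose segments all overlap one another, so the intersection of its intervals is the minimal region shared by that set, and the clique size is the region's recurrence count.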