Results 1 - 12 of 12
1.
Article in English | MEDLINE | ID: mdl-38215334

ABSTRACT

Clustering is a common technique for statistical data analysis and is essential for developing precision medicine. Numerous computational methods have been proposed for integrating multi-omics data to identify cancer subtypes. However, most existing clustering models based on network fusion fail to preserve the consistency of the distribution of the data before and after fusion. Motivated by this observation, we seek to measure and minimize the distribution difference between networks, which may not lie in the same space, to improve the performance of data fusion. We therefore developed a flexible clustering model, based on network fusion, that minimizes the distribution difference between the data before and after fusion by co-regularization; the model can be applied to both single- and multi-omics data. We propose a new network fusion model for single- and multi-omics data clustering, based on co-regularized network fusion (SMCC), for identifying cancer or cell subtypes. SMCC integrates low-rank subspace representation and entropy to fuse networks. In addition, it measures and minimizes the distribution difference between the similarity networks and the fusion network by co-regularization. The model can both reduce the noise interference in the source data and make the statistical characteristics of the fusion result closer to those of the source data. We evaluated the clustering performance of SMCC across 16 real single- and multi-omics datasets. The experimental results demonstrated that SMCC is superior to 17 state-of-the-art clustering methods. Moreover, it is effective for identifying cancer or cell subtypes, thereby promoting the development of precision medicine.

2.
iScience ; 26(4): 106517, 2023 Apr 21.
Article in English | MEDLINE | ID: mdl-37123236

ABSTRACT

Epithelial-to-mesenchymal transition (EMT) is the underlying mechanism for tumor metastasis and shows the metastatic potential of tumor cells. Although the transcriptional regulation of EMT has been well studied, the role of alternative splicing (AS) regulation in EMT remains largely uncharacterized. The rapid accumulation of RNA-seq datasets has provided the opportunities for developing computational methods to associate mRNA isoform variations with EMT. In this study, we propose regularization models to identify significant AS events during EMT. Our experimental results confirm that the predicted AS events are closely related to apoptosis, focal adhesion-invadopodium shift and tight junction formation that are essential during EMT. Therefore, our study highlights the broad role of posttranscriptional regulation during EMT and identifies key subsets of AS events serving as distinct regulatory nodes.

3.
PLoS Comput Biol ; 19(3): e1010939, 2023 03.
Article in English | MEDLINE | ID: mdl-36930678

ABSTRACT

During breast cancer metastasis, the developmental process epithelial-mesenchymal (EM) transition is abnormally activated. Transcriptional regulatory networks controlling EM transition are well-studied; however, alternative RNA splicing also plays a critical regulatory role during this process. Alternative splicing has been shown to control the EM transition process, and RNA-binding proteins have been found to regulate alternative splicing. However, a comprehensive understanding of alternative splicing and the RNA-binding proteins that regulate it during EM transition, and of their dynamic impact on breast cancer, is still lacking. To accurately study the dynamic regulatory relationships, time-series data of the EM transition process are essential. However, only cross-sectional data of epithelial and mesenchymal specimens are available. Therefore, we developed a pseudotemporal causality-based Bayesian (PCB) approach to infer the dynamic regulatory relationships between alternative splicing events and RNA-binding proteins. Our study sheds light on regulatory network-based approaches for identifying key RNA-binding proteins or target alternative splicing events for the diagnosis or treatment of cancers. The data and code for PCB are available at: http://hkumath.hku.hk/~wkc/PCB(data+code).zip.


Subjects
Breast Neoplasms; Humans; Female; Breast Neoplasms/metabolism; Bayes Theorem; Cross-Sectional Studies; Cell Line, Tumor; Neoplastic Processes; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism; Alternative Splicing/genetics; Epithelial-Mesenchymal Transition/genetics
4.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35437603

ABSTRACT

Each type of cancer usually has several subtypes with distinct clinical implications, and therefore the discovery of cancer subtypes is an important and urgent task in disease diagnosis and therapy. Using single-omics data to predict cancer subtypes is difficult because genomes are dysregulated and complicated by multiple molecular mechanisms, and therefore linking cancer genomes to cancer phenotypes is not an easy task. Using multi-omics data to effectively predict cancer subtypes is an area of much interest; however, integrating multi-omics data is challenging. Here, we propose a novel method of multi-omics data integration for clustering to identify cancer subtypes (MDICC) that integrates new affinity matrix and network fusion methods. Our experimental results show the effectiveness and generalization of the proposed MDICC model in identifying cancer subtypes, and its performance was better than those of currently available state-of-the-art clustering methods. Furthermore, the survival analysis demonstrates that MDICC delivered comparable or even better results than many typical integrative methods.


Subjects
Neoplasms; Cluster Analysis; Humans; Neoplasms/genetics; Survival Analysis
5.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34410342

ABSTRACT

MOTIVATION: The epithelial-mesenchymal transition (EMT) is a cellular-developmental process activated during tumor metastasis. Transcriptional regulatory networks controlling EMT are well studied; however, alternative RNA splicing also plays a critical regulatory role during this process. Unfortunately, a comprehensive understanding of alternative splicing (AS) and the RNA-binding proteins (RBPs) that regulate it during EMT is still lacking. Therefore, a great need exists to develop effective computational methods for predicting associations of RBPs and AS events. Dramatically increasing data sources that have direct and indirect information associated with RBPs and AS events have provided an ideal platform for inferring these associations. RESULTS: In this study, we propose a novel method for RBP-AS target prediction based on weighted data fusion with sparse matrix tri-factorization (WDFSMF) that simultaneously decomposes heterogeneous data source matrices into low-rank matrices to reveal hidden associations. WDFSMF can select and integrate data sources by assigning different weights to those sources, and these weights can be assigned automatically. In addition, WDFSMF can identify significant RBP complexes regulating AS events and eliminate noise and outliers from the data. Our proposed method achieves an area under the receiver operating characteristic curve (AUC) of 90.78%, which shows that WDFSMF can effectively predict RBP-AS event associations with higher accuracy compared with previous methods. Furthermore, this study identifies significant RBPs as complexes for AS events during EMT and provides solid ground for further investigation into RNA regulation during EMT and metastasis. WDFSMF is a general data fusion framework, and as such it can also be adapted to predict associations between other biological entities.
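
Editorial illustration: the AUC quoted in this abstract can be read as the probability that a randomly chosen true association receives a higher score than a randomly chosen non-association. The Python sketch below implements only that pairwise definition. The scores and labels are made-up illustration data, and the function name `auc` is ours; nothing here reproduces WDFSMF or its evaluation protocol.

```python
def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical prediction scores for five candidate RBP-AS pairs
# (label 1 = known association, 0 = no known association).
example_scores = [0.9, 0.8, 0.4, 0.35, 0.1]
example_labels = [1, 0, 1, 0, 0]
example_auc = auc(example_scores, example_labels)
```

The quadratic loop is chosen for readability; a rank-based computation would be preferable for large prediction matrices.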


Subjects
Alternative Splicing; Computational Biology/methods; Epithelial-Mesenchymal Transition/genetics; Gene Expression Regulation, Neoplastic; RNA-Binding Proteins/metabolism; Algorithms; Computational Biology/standards; Humans; ROC Curve; Reproducibility of Results; Software
6.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33517359

ABSTRACT

MOTIVATION: The developmental process of epithelial-mesenchymal transition (EMT) is abnormally activated during breast cancer metastasis. Transcriptional regulatory networks that control EMT have been well studied; however, alternative RNA splicing plays a vital regulatory role during this process and the regulating mechanism needs further exploration. Because of the huge cost and complexity of biological experiments, the underlying mechanisms of alternative splicing (AS) and associated RNA-binding proteins (RBPs) that regulate the EMT process remain largely unknown. Thus, there is an urgent need to develop computational methods for predicting potential RBP-AS event associations during EMT. RESULTS: We developed a novel model for RBP-AS target prediction during EMT that is based on inductive matrix completion (RAIMC). Integrated RBP similarities were calculated based on RBP regulating similarity, and RBP Gaussian interaction profile (GIP) kernel similarity, while integrated AS event similarities were computed based on AS event module similarity and AS event GIP kernel similarity. Our primary objective was to complete missing or unknown RBP-AS event associations based on known associations and on integrated RBP and AS event similarities. In this paper, we identify significant RBPs for AS events during EMT and discuss potential regulating mechanisms. Our computational results confirm the effectiveness and superiority of our model over other state-of-the-art methods. Our RAIMC model achieved AUC values of 0.9587 and 0.9765 based on leave-one-out cross-validation (CV) and 5-fold CV, respectively, which are larger than the AUC values from the previous models. RAIMC is a general matrix completion framework that can be adopted to predict associations between other biological entities. We further validated the prediction performance of RAIMC on the genes CD44 and MAP3K7. RAIMC can identify the related regulating RBPs for isoforms of these two genes. 
AVAILABILITY AND IMPLEMENTATION: The source code for RAIMC is available at https://github.com/yushanqiu/RAIMC. CONTACT: zouquan@nclab.net
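
Editorial illustration: the Gaussian interaction profile (GIP) kernel named in this abstract is commonly defined as K(i, j) = exp(-gamma * ||a_i - a_j||^2), where a_i is row i of the binary association matrix and gamma is a bandwidth parameter divided by the mean squared norm of the rows. The Python sketch below follows that common definition; whether RAIMC uses exactly this normalization is an assumption, and the toy matrix and the parameter name `gamma_prime` are hypothetical.

```python
import math


def gip_kernel(assoc, gamma_prime=1.0):
    """GIP kernel similarity between the rows of a 0/1 association matrix."""
    n = len(assoc)
    # Normalize the bandwidth by the average squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in assoc) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix contains no known associations")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(assoc[i], assoc[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Hypothetical toy matrix: 3 RBPs (rows) x 4 AS events (columns).
toy = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
]
toy_kernel = gip_kernel(toy)
```

Applying the same function to the transposed matrix would give the corresponding similarity between AS events.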


Subjects
Alternative Splicing; Breast Neoplasms; Epithelial-Mesenchymal Transition/genetics; Gene Expression Regulation, Neoplastic; Gene Regulatory Networks; Neoplasm Proteins; RNA-Binding Proteins; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Female; Humans; Neoplasm Proteins/genetics; Neoplasm Proteins/metabolism; RNA-Binding Proteins/genetics; RNA-Binding Proteins/metabolism
7.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2714-2723, 2021.
Article in English | MEDLINE | ID: mdl-32386162

ABSTRACT

Clustering tumor metastasis samples from gene expression data at the whole genome level remains an arduous challenge, particularly when the number of experimental samples is small and the number of genes is huge. Here, we focus on the prediction of the epithelial-mesenchymal transition (EMT), an underlying mechanism of tumor metastasis, rather than on tumor metastasis itself, to avoid the confounding effects of uncertainties derived from various factors. In this paper, we propose a novel model for predicting EMT based on multidimensional scaling (MDS) strategies, integrating entropy and random matrix detection strategies to determine the optimal reduced number of dimensions in the low-dimensional space. We verified our proposed model with gene expression data for EMT samples of breast cancer, and the experimental results demonstrated its superiority over state-of-the-art clustering methods. Furthermore, we developed a novel feature extraction method for selecting the significant genes and predicting tumor metastasis. The source code is available at "https://github.com/yushanqiu/yushan.qiu-szu.edu.cn".


Subjects
Computational Biology/methods; Epithelial-Mesenchymal Transition/genetics; Multidimensional Scaling Analysis; Unsupervised Machine Learning; Breast Neoplasms/genetics; Breast Neoplasms/pathology; Cluster Analysis; Female; Humans; Neoplasm Metastasis/genetics; Transcriptome/genetics
8.
RNA ; 26(9): 1257-1267, 2020 09.
Article in English | MEDLINE | ID: mdl-32467311

ABSTRACT

During breast cancer metastasis, the developmental process epithelial-mesenchymal transition (EMT) is abnormally activated. Transcriptional regulatory networks controlling EMT are well-studied; however, alternative RNA splicing also plays a critical regulatory role during this process. A comprehensive understanding of alternative splicing (AS) and the RNA binding proteins (RBPs) that regulate it during EMT and their impact on breast cancer remains largely unknown. In this study, we annotated AS in the breast cancer TCGA data set and identified an AS signature that is capable of distinguishing epithelial and mesenchymal states of the tumors. This AS signature contains 25 AS events, among which nine showed increased exon inclusion and 16 showed exon skipping during EMT. This AS signature accurately assigns the EMT status of cells in the CCLE data set and robustly predicts patient survival. We further developed an effective computational method using bipartite networks to identify RBP-AS networks during EMT. This network analysis revealed the complexity of RBP regulation and nominated previously unknown RBPs that regulate EMT-associated AS events. This study highlights the importance of global AS regulation during EMT in cancer progression and paves the way for further investigation into RNA regulation in EMT and metastasis.


Subjects
Alternative Splicing/genetics; Breast Neoplasms/genetics; Epithelial-Mesenchymal Transition/genetics; RNA/genetics; Cell Line, Tumor; Exons/genetics; Female; Gene Expression Regulation, Neoplastic/genetics; Gene Regulatory Networks/genetics; Humans; MCF-7 Cells; RNA-Binding Proteins/genetics
9.
Artif Intell Med ; 95: 96-103, 2019 04.
Article in English | MEDLINE | ID: mdl-30352711

ABSTRACT

Identifying tumor metastasis signatures from gene expression data at the whole genome level remains an arduous challenge, particularly when the number of genes is huge and the number of experimental samples is small. Here, we focus on the prediction of the epithelial-mesenchymal transition (EMT), an underlying mechanism of tumor metastasis, rather than on tumor metastasis itself, to avoid the confounding effects of uncertainties derived from various factors. We apply an extension of the LASSO model, the L1/2-regularization model, as a feature selector to identify significant RNA-binding proteins (RBPs) that contribute to regulating the EMT. We find that the L1/2-regularization model significantly outperforms LASSO in the EMT regulation problem. Furthermore, a remarkable improvement in the classification performance of the L1/2-regularization model can be achieved by incorporating extra information, specifically correlation values. We demonstrate that the L1/2-regularization model is applicable for identifying significant RBPs in biological research. Identified RBPs will facilitate study of the underlying mechanisms of the EMT.
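
Editorial illustration: L1/2 regularization replaces the LASSO penalty, lambda times the sum of |beta_j|, with lambda times the sum of |beta_j|^(1/2). The Python sketch below only compares the two penalty terms on two hypothetical coefficient vectors; it is not the authors' solver and does not include the loss function or the correlation information mentioned in the abstract.

```python
def lasso_penalty(beta, lam=1.0):
    """L1 penalty: lam * sum(|beta_j|)."""
    return lam * sum(abs(b) for b in beta)


def l_half_penalty(beta, lam=1.0):
    """L1/2 penalty: lam * sum(|beta_j| ** 0.5)."""
    return lam * sum(abs(b) ** 0.5 for b in beta)


# Two hypothetical coefficient vectors with the same L1 norm.
dense = [0.25, 0.25, 0.25, 0.25]   # weight spread over four features
sparse = [1.0, 0.0, 0.0, 0.0]      # all weight on one feature

l1_dense, l1_sparse = lasso_penalty(dense), lasso_penalty(sparse)
lh_dense, lh_sparse = l_half_penalty(dense), l_half_penalty(sparse)
```

The L1 penalty cannot distinguish the two vectors (both evaluate to 1.0), whereas the L1/2 penalty charges the dense vector twice as much (2.0 versus 1.0). This stronger preference for sparse solutions is the usual motivation for using it as a feature selector.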


Subjects
Epithelial-Mesenchymal Transition; RNA-Binding Proteins/physiology; Algorithms; Cell Line, Tumor; Humans; Models, Biological
10.
RNA ; 24(10): 1326-1338, 2018 10.
Article in English | MEDLINE | ID: mdl-30042172

ABSTRACT

The epithelial-mesenchymal transition (EMT) is a fundamental developmental process that is abnormally activated in cancer metastasis. Dynamic changes in alternative splicing occur during EMT. ESRP1 and hnRNPM are splicing regulators that promote an epithelial splicing program and a mesenchymal splicing program, respectively. The functional relationships between these splicing factors in the genome scale remain elusive. Comparing alternative splicing targets of hnRNPM and ESRP1 revealed that they coregulate a set of cassette exon events, with the majority showing discordant splicing regulation. Discordant splicing events regulated by hnRNPM show a positive correlation with splicing during EMT; however, concordant events do not, indicating the role of hnRNPM in regulating alternative splicing during EMT is more complex than previously understood. Motif enrichment analysis near hnRNPM-ESRP1 coregulated exons identifies guanine-uridine rich motifs downstream from hnRNPM-repressed and ESRP1-enhanced exons, supporting a general model of competitive binding to these cis-elements to antagonize alternative splicing. The set of coregulated exons are enriched in genes associated with cell migration and cytoskeletal reorganization, which are pathways associated with EMT. Splicing levels of coregulated exons are associated with breast cancer patient survival and correlate with gene sets involved in EMT and breast cancer subtyping. This study identifies complex modes of interaction between hnRNPM and ESRP1 in regulation of splicing in disease-relevant contexts.


Subjects
Alternative Splicing; Epithelial-Mesenchymal Transition/genetics; Gene Expression Regulation; Heterogeneous-Nuclear Ribonucleoprotein Group M/metabolism; RNA-Binding Proteins/metabolism; Binding Sites; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Breast Neoplasms/mortality; Cell Line, Tumor; Exons; Female; Gene Expression Regulation, Neoplastic; Humans; Nucleotide Motifs; Prognosis; Protein Binding; Reproducibility of Results
11.
IET Syst Biol ; 8(4): 162-8, 2014 Aug.
Article in English | MEDLINE | ID: mdl-25075529

ABSTRACT

Pancreatic cancer is a devastating disease, and predicting the status of patients has become an important and urgent issue. The authors explore the applicability of the inductive logic programming (ILP) method to the disease and show that accumulated clinical laboratory data can be used to predict disease characteristics, which will contribute to the selection of therapeutic modalities for pancreatic cancer. The availability of a large amount of clinical laboratory data provides clues to aid in the knowledge discovery of diseases. In predicting tumour differentiation and the status of lymph node metastasis in pancreatic cancer with the ILP model, three rules are developed that are consistent with descriptions in the literature. The identified rules are useful for detecting tumour differentiation and the status of lymph node metastasis in pancreatic cancer and therefore contribute significantly to decisions on therapeutic strategies. In addition, the proposed method is compared with other typical classification techniques, and the results further confirm the superiority and merit of the proposed method.


Subjects
Artificial Intelligence; Biomarkers, Tumor/blood; Decision Support Systems, Clinical; Diagnosis, Computer-Assisted/methods; Logistic Models; Pancreatic Neoplasms/diagnosis; Computer Simulation; Decision Support Techniques; Humans; Lymphatic Metastasis; Pancreatic Neoplasms/blood; Reproducibility of Results; Risk Assessment/methods; Sensitivity and Specificity
12.
BMC Syst Biol ; 8 Suppl 1: S7, 2014.
Article in English | MEDLINE | ID: mdl-24565276

ABSTRACT

BACKGROUND: A Boolean network (BN) is a mathematical model of a genetic network, and the control of genetic networks has become an important issue owing to its potential application in drug discovery and the treatment of intractable diseases. Early research focused primarily on the analysis of attractor control for a randomly generated BN. However, one may also consider how anti-cancer drugs act in both normal and cancer cells. Thus, the development of controls for multiple BNs is an important and interesting challenge. RESULTS: In this article, we formulate three novel problems about attractor control for two BNs (i.e., normal cell and cancer cell). The first is about finding a control that can significantly damage cancer cells but causes limited damage to normal cells. The second is about finding a control for normal cells with a guaranteed damaging effect on cancer cells. Finally, we formulate a definition for finding a control for cancer cells with limited damaging effect on normal cells. We propose integer programming-based methods for solving these problems in a unified manner, and we conduct computational experiments to illustrate the efficiency and the effectiveness of our method for our multiple-BN control problems. CONCLUSIONS: We present three novel control problems for multiple BNs that are realistic control models for gene regulation networks and adopt an integer programming approach to address these problems. Experimental results indicate that our proposed method is useful and effective for moderate-size BNs.
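
Editorial illustration: an attractor of a Boolean network is a state, or a cycle of states, that the network keeps revisiting once reached, and a control fixes the value of one or more nodes to change which attractors exist. The Python sketch below enumerates the attractors of a hypothetical three-node network under synchronous updating, then repeats the search with one node pinned to 0. The network, the update rules and the brute-force enumeration are our assumptions for illustration; the integer programming formulation of the paper is not reproduced.

```python
from itertools import product


def step(state, rules):
    """Synchronously update every node of the Boolean network once."""
    return tuple(int(rule(state)) for rule in rules)


def find_attractors(rules):
    """Return all attractors, each as a frozenset of the states on its cycle."""
    attractors = set()
    for start in product((0, 1), repeat=len(rules)):
        seen = {}          # state -> time step at which it was first visited
        state, t = start, 0
        while state not in seen:
            seen[state] = t
            state = step(state, rules)
            t += 1
        first = seen[state]  # the cycle starts where the repeat was first seen
        attractors.add(frozenset(s for s, ts in seen.items() if ts >= first))
    return attractors


# Hypothetical network: x0' = x1, x1' = x0, x2' = x0 AND x1.
rules = [
    lambda s: s[1],
    lambda s: s[0],
    lambda s: s[0] and s[1],
]
free_attractors = find_attractors(rules)

# Hypothetical control: pin node x0 to 0, leave the other rules unchanged.
controlled_rules = [lambda s: 0] + rules[1:]
controlled_attractors = find_attractors(controlled_rules)
```

In this toy network the uncontrolled dynamics have three attractors (two fixed points and one two-state cycle), and pinning x0 to 0 leaves only the all-zero fixed point. Exhaustive enumeration visits all 2^n states, so it is only practical for small networks.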


Subjects
Gene Regulatory Networks; Models, Genetic; Neoplasms/genetics; Algorithms; Computational Biology; Neoplasms/pathology