1.
Med Image Anal ; 93: 103103, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38368752

ABSTRACT

Accurate prognosis prediction for nasopharyngeal carcinoma based on magnetic resonance (MR) images assists in the guidance of treatment intensity, thus reducing the risk of recurrence and death. To reduce repeated labor and sufficiently explore domain knowledge, aggregating labeled/annotated data from external sites enables us to train an intelligent model for a clinical site with unlabeled data. However, this task suffers from the challenges of incomplete multi-modal examination data fusion and image data heterogeneity among sites. This paper proposes a cross-site survival analysis method for prognosis prediction of nasopharyngeal carcinoma from a domain adaptation viewpoint. Utilizing a Cox model as the basic framework, our method equips it with a cross-attention-based multi-modal fusion regularization. This regularization model effectively fuses the multi-modal information from multi-parametric MR images and clinical features onto a domain-adaptive space, despite the absence of some modalities. To enhance the feature discrimination, we also extend the contrastive learning technique to censored data cases. Compared with the conventional approaches, which directly deploy a trained survival model in a new site, our method achieves superior prognosis prediction performance in cross-site validation experiments. These results highlight the key role of cross-site adaptability of our method and support its value in clinical practice.


Subject(s)
Learning , Nasopharyngeal Neoplasms , Humans , Nasopharyngeal Carcinoma/diagnostic imaging , Prognosis , Nasopharyngeal Neoplasms/diagnostic imaging
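
The abstract of entry 1 names a Cox model as its basic framework. As a rough illustration of what such a framework optimizes, the sketch below computes the standard Cox negative log partial likelihood for a handful of patients. It is a minimal sketch under stated assumptions: the function name and argument layout are hypothetical, ties are not corrected for, and none of the paper's cross-attention fusion, domain adaptation, or contrastive terms are reproduced.

```python
import math


def cox_neg_log_partial_likelihood(risk_scores, times, events):
    """Negative log partial likelihood of a Cox model (no tie correction).

    risk_scores: predicted log-hazard per patient (higher means riskier)
    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    """
    total = 0.0
    for score_i, time_i, event_i in zip(risk_scores, times, events):
        if not event_i:
            # Censored patients contribute only through other patients' risk sets.
            continue
        # Risk set: every patient still under observation at time_i.
        risk_set = [s for s, t in zip(risk_scores, times) if t >= time_i]
        log_denominator = math.log(sum(math.exp(s) for s in risk_set))
        total -= score_i - log_denominator
    return total
```

A model that assigns higher risk to patients who fail earlier receives a lower loss, which is the property a survival network trained with this objective relies on.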
2.
IEEE Trans Pattern Anal Mach Intell ; 45(4): 4198-4213, 2023 Apr.
Article in English | MEDLINE | ID: mdl-35830411

ABSTRACT

As a fundamental mechanism of learning and cognition, transfer learning has attracted widespread attention in recent years. Typical transfer learning tasks include unsupervised domain adaptation (UDA) and few-shot learning (FSL), which both attempt to sufficiently transfer discriminative knowledge from the training environment to the test environment to improve the model's generalization performance. Previous transfer learning methods usually ignore the potential conditional distribution shift between environments. This leads to degraded discriminability in the test environments. Therefore, how to construct a learnable and interpretable metric to measure and then reduce the gap between conditional distributions is very important in the literature. In this article, we design the Conditional Kernel Bures (CKB) metric for characterizing conditional distribution discrepancy, and derive an empirical estimation with a convergence guarantee. CKB provides a statistical and interpretable approach, under the optimal transportation framework, to understand the knowledge transfer mechanism. It is essentially an extension of optimal transportation from the marginal distributions to the conditional distributions. CKB can be used as a plug-and-play module and placed on the loss layer of deep networks; thus, it plays a bottleneck role in representation learning. From this perspective, the new method with network architecture is abbreviated as BuresNet, and it can be used to extract conditional invariant features for both UDA and FSL tasks. BuresNet can be trained in an end-to-end manner. Extensive experimental results on several benchmark datasets validate the effectiveness of BuresNet.

3.
Bioinformatics ; 38(5): 1353-1360, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34864881

ABSTRACT

MOTIVATION: Drug repositioning that aims to find new indications for existing drugs has been an efficient strategy for drug discovery. In the scenario where we only have confirmed disease-drug associations as positive pairs, a negative set of disease-drug pairs is usually constructed from the unknown disease-drug pairs in previous studies, where we do not know whether drugs and diseases can be associated, to train a model for disease-drug association prediction (drug repositioning). Drugs and diseases in these negative pairs can potentially be associated, but most studies have ignored them. RESULTS: We present a method, springD2A, to capture the uncertainty in the negative pairs, and to discriminate between positive and unknown pairs because the former are more reliable. In springD2A, we introduce a spring-like penalty for the loss of negative pairs, which is strong if they are too close in a unit sphere, but mild if they are at a moderate distance. We also design a sequential sampling in which the probability of an unknown disease-drug pair sampled as negative is proportional to its score predicted as positive. Multiple models are learned during sequential sampling, and we adopt parameter- and feature-based ensemble schemes to boost performance. Experiments show springD2A is an effective tool for drug-repositioning. AVAILABILITY AND IMPLEMENTATION: A python implementation of springD2A and datasets used in this study are available at https://github.com/wangyuanhao/springD2A. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Drug Repositioning , Uncertainty , Probability , Drug Discovery
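
The abstract of entry 3 describes a sampling step in which an unknown disease-drug pair is drawn as a negative with probability proportional to its predicted positive score. The sketch below shows one simple way to draw such a sample. It is only an illustration under assumptions: the function name, the pair representation, and the choice to sample with replacement are hypothetical and are not taken from the springD2A implementation.

```python
import random


def sample_negatives(unknown_pairs, positive_scores, k, rng=None):
    """Draw k pairs, each with probability proportional to its score.

    unknown_pairs:   list of (disease, drug) pairs with no confirmed association
    positive_scores: non-negative model scores, one per pair
    k:               number of pairs to draw (with replacement, by assumption)
    rng:             optional random.Random instance for reproducibility
    """
    rng = rng or random.Random()
    # random.choices normalizes the weights internally.
    return rng.choices(unknown_pairs, weights=positive_scores, k=k)
```

Passing a seeded `random.Random` makes the draw reproducible, which helps when several models are trained over successive sampling rounds.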
4.
IEEE Trans Pattern Anal Mach Intell ; 44(3): 1653-1669, 2022 03.
Article in English | MEDLINE | ID: mdl-32749963

ABSTRACT

Unsupervised domain adaptation is effective in leveraging rich information from a labeled source domain to an unlabeled target domain. Though deep learning and adversarial strategy made a significant breakthrough in the adaptability of features, there are two issues to be further studied. First, hard-assigned pseudo labels on the target domain are arbitrary and error-prone, and direct application of them may destroy the intrinsic data structure. Second, batch-wise training of deep learning limits the characterization of the global structure. In this paper, a Riemannian manifold learning framework is proposed to achieve transferability and discriminability simultaneously. For the first issue, this framework establishes a probabilistic discriminant criterion on the target domain via soft labels. Based on pre-built prototypes, this criterion is extended to a global approximation scheme for the second issue. Manifold metric alignment is adopted to be compatible with the embedding space. The theoretical error bounds of different alignment metrics are derived for constructive guidance. The proposed method can be used to tackle a series of variants of domain adaptation problems, including both vanilla and partial settings. Extensive experiments have been conducted to investigate the method and a comparative study shows the superiority of the discriminative manifold learning framework.


Subject(s)
Algorithms
5.
IEEE Trans Cybern ; 52(8): 8352-8365, 2022 Aug.
Article in English | MEDLINE | ID: mdl-33544687

ABSTRACT

For a broad range of applications, hyperspectral image (HSI) classification is a hot topic in remote sensing, and convolutional neural network (CNN)-based methods are drawing increasing attention. However, to train millions of parameters in CNN requires a large number of labeled training samples, which are difficult to collect. A conventional Gabor filter can effectively extract spatial information with different scales and orientations without training, but it may be missing some important discriminative information. In this article, we propose the Gabor ensemble filter (GEF), a new convolutional filter to extract deep features for HSI with fewer trainable parameters. GEF filters each input channel by some fixed Gabor filters and learnable filters simultaneously, then reduces the dimensions by some learnable 1×1 filters to generate the output channels. The fixed Gabor filters can extract common features with different scales and orientations, while the learnable filters can learn some complementary features that Gabor filters cannot extract. Based on GEF, we design a network architecture for HSI classification, which extracts deep features and can learn from limited training samples. In order to simultaneously learn more discriminative features and an end-to-end system, we propose to introduce the local discriminant structure for cross-entropy loss by combining the triplet hard loss. Results of experiments on three HSI datasets show that the proposed method has significantly higher classification accuracy than other state-of-the-art methods. Moreover, the proposed method is speedy for both training and testing.


Subject(s)
Algorithms , Neural Networks, Computer
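
Entry 5 relies on fixed Gabor filters that respond to different scales and orientations without training. The sketch below builds a small bank of such kernels using the standard real-valued Gabor formula (a Gaussian envelope multiplied by a cosine carrier). It is a minimal sketch: the function names, parameter names, and default values are assumptions, and the learnable filters and 1×1 reduction that the abstract attributes to GEF are not included.

```python
import math


def gabor_kernel(size, sigma, theta, wavelength, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter as a size x size nested list.

    size:       odd kernel width/height in pixels
    sigma:      standard deviation of the Gaussian envelope
    theta:      orientation of the filter in radians
    wavelength: wavelength of the cosine carrier (controls scale)
    gamma:      spatial aspect ratio of the envelope
    psi:        phase offset of the carrier
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates into the filter's orientation.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(x_rot ** 2 + (gamma * y_rot) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * x_rot / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


def gabor_bank(size, sigma, wavelengths, n_orientations):
    """One kernel per (wavelength, orientation) pair, orientations spread over [0, pi)."""
    return [
        gabor_kernel(size, sigma, k * math.pi / n_orientations, wavelength)
        for wavelength in wavelengths
        for k in range(n_orientations)
    ]
```

Because every kernel is determined by a few fixed parameters, a bank like this adds no trainable weights, which is the property the abstract relies on when training samples are scarce.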
6.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34607358

ABSTRACT

The discovery of cancer subtypes has become a much-researched topic in oncology. Dividing cancer patients into subtypes can provide personalized treatments for heterogeneous patients. High-throughput technologies provide multiple omics data for cancer subtyping. Integration of multi-view data is used to identify cancer subtypes in many computational methods, which obtain different subtypes for the same cancer, even using the same multi-omics data. To a certain extent, these subtypes from distinct methods are related, which may have certain guiding significance for cancer subtyping. It is a challenge to effectively utilize the valuable information of distinct subtypes to produce more accurate and reliable subtypes. A weighted ensemble sparse latent representation (subtype-WESLR) is proposed to detect cancer subtypes on heterogeneous omics data. Using a weighted ensemble strategy to fuse base clustering obtained by distinct methods as prior knowledge, subtype-WESLR projects each sample feature profile from each data type to a common latent subspace while maintaining the local structure of the original sample feature space and consistency with the weighted ensemble, and optimizes the common subspace by an iterative method to identify cancer subtypes. We conduct experiments on various synthetic datasets and eight public multi-view datasets from The Cancer Genome Atlas. The results demonstrate that subtype-WESLR is better than competing methods by utilizing the integration of base clustering of existing methods for more precise subtypes.


Subject(s)
Algorithms , Neoplasms , Cluster Analysis , Humans , Neoplasms/genetics
7.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34607360

ABSTRACT

Learning node representation is a fundamental problem in biological network analysis, as compact representation features reveal complicated network structures and carry useful information for downstream tasks such as link prediction and node classification. Recently, multiple networks that profile objects from different aspects are increasingly accumulated, providing the opportunity to learn objects from multiple perspectives. However, the complex common and specific information across different networks poses challenges to node representation methods. Moreover, ubiquitous noise in networks calls for more robust representation. To deal with these problems, we present a representation learning method for multiple biological networks. First, we accommodate the noise and spurious edges in networks using denoised diffusion, providing robust connectivity structures for the subsequent representation learning. Then, we introduce a graph regularized integration model to combine refined networks and compute common representation features. By using the regularized decomposition technique, the proposed model can effectively preserve the common structural property of different networks and simultaneously accommodate their specific information, leading to a consistent representation. A simulation study shows the superiority of the proposed method on different levels of noisy networks. Three network-based inference tasks, including drug-target interaction prediction, gene function identification and fine-grained species categorization, are conducted using representation features learned from our method. Biological networks at different scales and levels of sparsity are involved. Experimental results on real-world data show that the proposed method has robust performance compared with alternatives. Overall, by eliminating noise and integrating effectively, the proposed method is able to learn useful representations from multiple biological networks.


Subject(s)
Learning , Neural Networks, Computer , Computer Simulation , Diffusion
8.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33822879

ABSTRACT

With diverse types of omics data widely available, many computational methods have been recently developed to integrate these heterogeneous data, providing a comprehensive understanding of diseases and biological mechanisms. But most of them hardly take noise effects into account. Data-specific patterns unique to data types also make it challenging to uncover the consistent patterns and learn a compact representation of multi-omics data. Here we present a multi-omics integration method considering these issues. We explicitly model the error term in data reconstruction and simultaneously consider noise effects and data-specific patterns. We utilize a denoised network regularization in which we build a fused network using a denoising procedure to suppress noise effects and data-specific patterns. The error term collaborates with the denoised network regularization to capture data-specific patterns. We solve the optimization problem via an inexact alternating minimization algorithm. A comparative simulation study shows the method's superiority at discovering common patterns among data types at three noise levels. Transcriptomics-and-epigenomics integration, in seven cancer cohorts from The Cancer Genome Atlas, demonstrates that the learned integrative representation extracted in an unsupervised manner can depict survival information. Specifically, in liver hepatocellular carcinoma, the learned integrative representation attains an average Harrell's C-index of 0.78 in 10-times repeated 3-fold cross-validation for survival prediction, which far exceeds competing methods, and we discover an aggressive subtype in liver hepatocellular carcinoma with this latent representation, which is validated by an external dataset, GSE14520. We also show that DeFusion is applicable to the integration of other omics types.


Subject(s)
Carcinoma, Hepatocellular/genetics , Carcinoma, Hepatocellular/mortality , Epigenomics/methods , Liver Neoplasms/genetics , Liver Neoplasms/mortality , Transcriptome , Algorithms , Bayes Theorem , Cohort Studies , DNA Methylation/genetics , Deep Learning , Humans , MicroRNAs/genetics , Prognosis , RNA, Messenger/genetics
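
Entry 8 reports its survival-prediction quality as Harrell's C-index. For readers unfamiliar with the metric, the sketch below computes it directly from its definition: among all comparable patient pairs, the fraction in which the patient with the shorter observed survival was assigned the higher risk. The function name and argument layout are assumptions, and this quadratic-time version is meant for illustration rather than for large cohorts.

```python
def harrell_c_index(risk_scores, times, events):
    """Harrell's concordance index for right-censored survival data.

    risk_scores: predicted risk per patient (higher means shorter expected survival)
    times:       observed follow-up time per patient
    events:      1 if the event was observed, 0 if the patient was censored
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            # A censored patient cannot be the earlier member of a comparable pair.
            continue
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which gives context for the 0.78 reported in the abstract.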
9.
IEEE/ACM Trans Comput Biol Bioinform ; 18(6): 2891-2897, 2021.
Article in English | MEDLINE | ID: mdl-33656995

ABSTRACT

The identification of cancer subtypes is of great importance for understanding the heterogeneity of tumors and providing patients with more accurate diagnoses and treatments. However, it is still a challenge to effectively integrate multiple omics data to establish cancer subtypes. In this paper, we propose an unsupervised integration method, named weighted multi-view low rank representation (WMLRR), to identify cancer subtypes from multiple types of omics data. Given a group of patients described by multiple omics data matrices, we first learn a unified affinity matrix which encodes the similarities among patients by exploring the sparsity-consistent low-rank representations from the joint decompositions of multiple omics data matrices. Unlike existing subtype identification methods that treat each omics data matrix equally, we assign a weight to each omics data matrix and learn these weights automatically through the optimization process. Finally, we apply spectral clustering on the learned affinity matrix to identify cancer subtypes. Experiment results show that the survival times between our identified cancer subtypes are significantly different, and our predicted survivals are more accurate than other state-of-the-art methods. In addition, some clinical analyses of the diseases also demonstrate the effectiveness of our method in identifying molecular subtypes with biological significance and clinical relevance.


Subject(s)
Computational Biology/methods , Neoplasms , Unsupervised Machine Learning , Algorithms , Cluster Analysis , DNA Methylation/genetics , Humans , Neoplasms/classification , Neoplasms/genetics , Neoplasms/mortality , Transcriptome/genetics
10.
IEEE Trans Cybern ; 51(2): 521-533, 2021 Feb.
Article in English | MEDLINE | ID: mdl-31059466

ABSTRACT

Establishing correspondence between two given geometrical graph structures is an important problem in computer vision and pattern recognition. In this paper, we propose a robust graph matching (RGM) model to improve the effectiveness and robustness of matching graphs with deformations, rotations, outliers, and noise. First, we embed the joint geometric transformation into the graph matching model, which performs unary matching over graph nodes and local structure matching over graph edges simultaneously. Then, the L2,1-norm is used as the similarity metric in the presented RGM to enhance the robustness. Finally, we derive an objective function which can be solved by an effective optimization algorithm, and theoretically prove the convergence of the proposed algorithm. Extensive experiments on various graph matching tasks, such as outliers, rotations, and deformations, show that the proposed RGM model achieves competitive performance compared to the existing methods.
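
Entry 10 uses the L2,1-norm as a robust similarity metric. The sketch below computes it under the common row-wise convention, where the norm is the sum of the Euclidean norms of the rows, and contrasts it with the Frobenius norm. The function names are hypothetical, and the row-wise convention is an assumption, since some authors sum over columns instead.

```python
import math


def l21_norm(matrix):
    """Sum of the Euclidean (L2) norms of the rows of a nested-list matrix."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in matrix)


def frobenius_norm(matrix):
    """Square root of the sum of all squared entries, shown for comparison."""
    return math.sqrt(sum(v * v for row in matrix for v in row))
```

Because each row enters the L2,1-norm without being squared, a single outlier row grows the penalty linearly rather than quadratically, which is the usual reason for calling it robust.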

11.
IEEE Trans Cybern ; 51(4): 2166-2177, 2021 Apr.
Article in English | MEDLINE | ID: mdl-31880576

ABSTRACT

Domain adaptation (DA) and transfer learning with statistical property description are very important in image analysis and data classification. This article studies the domain adaptive feature representation problem for heterogeneous data, in which both the feature dimensions and the sample distributions across domains are so different that their features cannot be matched directly. To transfer the discriminant information efficiently from the source domain to the target domain, and then enhance the classification performance for the target data, we first introduce two projection matrices specified for different domains to transform the heterogeneous features into a shared space. We then propose a joint kernel regression model to learn the regression variable, which is called the feature translator in this article. The novelty lies in the exploration of optimal experimental design (OED) to deal with the heterogeneous and nonlinear DA by seeking the covariance structured feature translators (CSFTs). An approximate and efficient method is proposed to compute the optimal data projections. Comprehensive experiments are conducted to validate the effectiveness and efficacy of the proposed model. The results show the state-of-the-art performance of our method in heterogeneous DA.

12.
IEEE Trans Cybern ; 51(4): 2006-2018, 2021 Apr.
Article in English | MEDLINE | ID: mdl-31150354

ABSTRACT

Conditional maximum mean discrepancy (CMMD) can capture the discrepancy between conditional distributions by drawing support from nonlinear kernel functions; thus, it has been successfully used for pattern classification. However, CMMD does not work well on complex distributions, especially when the kernel function fails to correctly characterize the difference between intraclass similarity and interclass similarity. In this paper, a new kernel learning method is proposed to improve the discrimination performance of CMMD. It can be operated with deep network features iteratively and thus denoted as KLN for abbreviation. The CMMD loss and an autoencoder (AE) are used to learn an injective function. By considering the compound kernel, that is, the injective function with a characteristic kernel, the effectiveness of CMMD for data category description is enhanced. KLN can simultaneously learn a more expressive kernel and label prediction distribution; thus, it can be used to improve the classification performance in both supervised and semisupervised learning scenarios. In particular, the kernel-based similarities are iteratively learned on the deep network features, and the algorithm can be implemented in an end-to-end manner. Extensive experiments are conducted on four benchmark datasets, including MNIST, SVHN, CIFAR-10, and CIFAR-100. The results indicate that KLN achieves the state-of-the-art classification performance.

13.
Mol Omics ; 16(5): 465-473, 2020 10 01.
Article in English | MEDLINE | ID: mdl-32572422

ABSTRACT

The development of single-cell RNA-sequencing (scRNA-seq) technologies brings tremendous opportunities for quantitative research and analyses at the cellular level. In particular, as a crucial task of scRNA-seq analysis, single cell clustering shines a light on natural groupings of cells to give new insights into the biological mechanisms and disease studies. However, it remains a challenge to identify cell clusters from lots of cell mixtures effectively and accurately. In this paper, we propose a novel adaptive joint clustering framework, named the low-rank self-representation K-means method (LRSK), to learn the data representation matrix and cluster indicator matrix jointly from scRNA-seq data. Specifically, instead of calculating the similarities among cells from the original data, we seek a low-rank representation of the original data to better reflect the underlying relationships among cells. Moreover, an Augmented Lagrangian Multiplier (ALM) based optimization algorithm is adopted to solve this problem. Experimental results on various scRNA-seq datasets and case studies demonstrate that our method performs better than other state-of-the-art single cell clustering algorithms. The analysis of unlabeled large single-cell liver cancer sequencing data further shows that our prediction results are more reasonable and interpretable.


Subject(s)
Algorithms , Sequence Analysis, RNA , Single-Cell Analysis , Cluster Analysis , Gene Expression Regulation
14.
IEEE Trans Neural Netw Learn Syst ; 31(4): 1417-1424, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31247579

ABSTRACT

As a powerful approach for exploratory data analysis, unsupervised clustering is a fundamental task in computer vision and pattern recognition. Many clustering algorithms have been developed, but most of them perform unsatisfactorily on data with complex structures. Recently, the adversarial autoencoder (AAE) has shown effectiveness in tackling such data by combining an autoencoder (AE) with adversarial training, but it cannot effectively extract classification information from the unlabeled data. In this brief, we propose dual AAE (Dual-AAE), which simultaneously maximizes the likelihood function and mutual information between observed examples and a subset of latent variables. By performing variational inference on the objective function of Dual-AAE, we derive a new reconstruction loss which can be optimized by training a pair of AEs. Moreover, to avoid mode collapse, we introduce the clustering regularization term for the category variable. Experiments on four benchmarks show that Dual-AAE achieves superior performance over state-of-the-art clustering methods. In addition, by adding a reject option, the clustering accuracy of Dual-AAE can reach that of supervised CNN algorithms. Dual-AAE can also be used for disentangling style and content of images without using supervised information.

15.
Article in English | MEDLINE | ID: mdl-31765312

ABSTRACT

Image set recognition has been widely applied in many practical problems like real-time video retrieval and image caption tasks. Due to its superior performance, it has grown into a significant topic in recent years. However, images with complicated variations, e.g., postures and human ages, are difficult to address, as these variations are continuous and gradual with respect to image appearance. Consequently, the crucial point of image set recognition is to mine the intrinsic connection or structural information from the image batches with variations. In this work, a Discriminant Residual Analysis (DRA) method is proposed to improve the classification performance by discovering discriminant features in related and unrelated groups. Specifically, DRA attempts to obtain a powerful projection which casts the residual representations into a discriminant subspace. Such a projection subspace is expected to magnify the useful information of the input space as much as possible, so that the relation between the training set and the test set described by the given metric or distance will be more precise in the discriminant subspace. We also propose a nonfeasance strategy by defining another approach to construct the unrelated groups, which helps to further reduce the cost of sampling errors. Two regularization approaches are used to deal with the probable small sample size problem. Extensive experiments are conducted on benchmark databases, and the results show the superiority and efficiency of the new methods.

16.
IEEE Trans Neural Netw Learn Syst ; 29(12): 6214-6226, 2018 12.
Article in English | MEDLINE | ID: mdl-29993753

ABSTRACT

We propose a set of novel radial basis functions (RBFs) with adaptive input and composite trend representation (AICTR) for portfolio selection (PS). Trend representation of asset price is one of the main sources of information to be exploited in PS. However, most state-of-the-art trend representation-based systems exploit only one kind of trend information and lack effective mechanisms to construct a composite trend representation. The proposed system exploits a set of RBFs with multiple trend representations, which improves the effectiveness and robustness in price prediction. Moreover, the input of the RBFs automatically switches to the best trend representation according to the recent investing performance of different price predictions. We also propose a novel objective to combine these RBFs and select the portfolio. Extensive experiments on six benchmark data sets (including a new challenging data set that we propose) from different real-world stock markets indicate that the proposed RBFs effectively combine different trend representations and AICTR achieves state-of-the-art investing performance and risk control. Besides, AICTR withstands reasonable transaction costs and runs fast; hence, it is applicable to real-world financial environments.

17.
IEEE Trans Cybern ; 48(4): 1124-1135, 2018 Apr.
Article in English | MEDLINE | ID: mdl-28368841

ABSTRACT

Local feature descriptors play a key role in different image classification applications. Some of these methods, such as local binary pattern and image gradient orientations, have been proven effective to some extent. However, such traditional descriptors, which utilize only single-type features, fail to capture the edge and orientation information and the intrinsic structure information of images. In this paper, we propose a kernel embedding multiorientation local pattern (MOLP) to address this problem. A given image is first transformed by gradient operators in local regions, which generate multiorientation gradient images containing edge and orientation information of different directions. Then the histogram feature, which takes into account the sign component and magnitude component, is extracted to form the refined feature from each orientation gradient image. The refined feature captures more information of the intrinsic structure, and is effective for image representation and classification. Finally, the multiorientation refined features are automatically fused in the kernel embedding discriminant subspace learning model. Extensive experiments on various image classification tasks, such as face recognition, texture classification, object categorization, and palmprint recognition, show that MOLP achieves performance competitive with state-of-the-art methods.

18.
IEEE Trans Neural Netw Learn Syst ; 29(7): 2823-2832, 2018 07.
Article in English | MEDLINE | ID: mdl-28600267

ABSTRACT

We propose a novel linear learning system based on the peak price tracking (PPT) strategy for portfolio selection (PS). Recently, the topic of tracking control has attracted intensive attention, and some novel models have been proposed based on backstepping methods, such that the system output tracks a desired trajectory. The proposed system has a similar evolution with a transform function that aggressively tracks the increasing power of different assets. As a result, the better performing assets will receive more investment. The proposed PPT objective can be formulated as a fast backpropagation algorithm, which is suitable for large-scale and time-limited applications, such as high-frequency trading. Extensive experiments on several benchmark data sets from diverse real financial markets show that PPT outperforms other state-of-the-art systems in computational time, cumulative wealth, and risk-adjusted metrics. This suggests that PPT is effective and even more robust than some defensive systems in PS.

19.
IEEE Trans Neural Netw Learn Syst ; 28(5): 1082-1094, 2017 05.
Article in English | MEDLINE | ID: mdl-26890929

ABSTRACT

A sparse representation classifier (SRC) and a kernel discriminant analysis (KDA) are two successful methods for face recognition. An SRC is good at dealing with occlusion, while a KDA does well in suppressing intraclass variations. In this paper, we propose kernel extended dictionary (KED) for face recognition, which provides an efficient way for combining KDA and SRC. We first learn several kernel principal components of occlusion variations as an occlusion model, which can represent the possible occlusion variations efficiently. Then, the occlusion model is projected by KDA to get the KED, which can be computed via the same kernel trick as new testing samples. Finally, we use structured SRC for classification, which is fast as only a small number of atoms are appended to the basic dictionary, and the feature dimension is low. We also extend KED to multikernel space to fuse different types of features at kernel level. Experiments are done on several large-scale data sets, demonstrating that not only does KED get impressive results for nonoccluded samples, but it also handles the occlusion well without overfitting, even with a single gallery sample per subject.

20.
BMC Bioinformatics ; 17(1): 358, 2016 Sep 09.
Article in English | MEDLINE | ID: mdl-27612563

ABSTRACT

BACKGROUND: Several recent studies have used the Minimum Dominating Set (MDS) model to identify driver nodes, which provide the control of the underlying networks, in protein interaction networks. There may exist multiple MDS configurations in a given network, thus it is difficult to determine which one represents the real set of driver nodes. Because these previous studies only focus on static networks and ignore the contextual information on particular tissues, their findings could be insufficient or even be misleading. RESULTS: In this study, we develop a Collective-Influence-corrected Minimum Dominating Set (CI-MDS) model which takes into account the collective influence of proteins. By integrating molecular expression profiles and static protein interactions, 16 tissue-specific networks are established as well. We then apply the CI-MDS model to each tissue-specific network to detect MDS proteins. It generates almost the same MDSs when it is solved using different optimization algorithms. In addition, we classify MDS proteins into Tissue-Specific MDS (TS-MDS) proteins and HouseKeeping MDS (HK-MDS) proteins based on the number of tissues in which they are expressed and identified as MDS proteins. Notably, we find that TS-MDS proteins and HK-MDS proteins have significantly different topological and functional properties. HK-MDS proteins are more central in protein interaction networks, associated with more functions, evolving more slowly and subjected to a greater number of post-translational modifications than TS-MDS proteins. Unlike TS-MDS proteins, HK-MDS proteins significantly correspond to essential genes, ageing genes, virus-targeted proteins, transcription factors and protein kinases. Moreover, we find that besides HK-MDS proteins, many TS-MDS proteins are also linked to disease related genes, suggesting the tissue specificity of human diseases. 
Furthermore, functional enrichment analysis reveals that HK-MDS proteins carry out universally necessary biological processes and TS-MDS proteins are usually involved in tissue-dependent functions. CONCLUSIONS: Our study uncovers key features of TS-MDS proteins and HK-MDS proteins, and is a step towards a better understanding of the controllability of human interactomes.


Subject(s)
Genes, Essential , Organ Specificity/genetics , Protein Interaction Maps , Aging/genetics , Algorithms , Evolution, Molecular , Gene Ontology , Humans , Models, Theoretical , Neoplasms/genetics , Protein Processing, Post-Translational/genetics , Viruses/metabolism
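
Entry 20 builds on the Minimum Dominating Set (MDS) model, in which every node of a network is either in the selected set or adjacent to a member of it. The sketch below finds a dominating set with a simple greedy heuristic. It is an illustration only: the function name and the dictionary-based graph format are assumptions, the greedy result is not guaranteed to be minimum, and the collective-influence correction of CI-MDS is not modeled.

```python
def greedy_dominating_set(adjacency):
    """Greedy heuristic for a (not necessarily minimum) dominating set.

    adjacency: dict mapping each node to the set of its neighbours
               (undirected graph, so edges must appear in both directions)
    """
    undominated = set(adjacency)
    dominating = set()
    while undominated:
        # Pick the node that newly dominates the most nodes; sorting the
        # candidates makes tie-breaking deterministic.
        best = max(
            sorted(adjacency),
            key=lambda node: len(({node} | adjacency[node]) & undominated),
        )
        dominating.add(best)
        undominated -= {best} | adjacency[best]
    return dominating
```

On a star-shaped network the heuristic returns only the hub. On larger networks several different sets of the same size can exist, which is the ambiguity the abstract raises about choosing among multiple MDS configurations.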