1.
Entropy (Basel) ; 21(3), 2019 Mar 06.
Article in English | MEDLINE | ID: mdl-33266969

ABSTRACT

Presently, many users are involved in multiple social networks. Identifying the same user in different networks, also known as anchor link prediction, becomes an important problem, which can serve numerous applications, e.g., cross-network recommendation, user profiling, etc. Previous studies mainly use hand-crafted structure features, which, if not carefully designed, may fail to reflect the intrinsic structure regularities. Moreover, most of the methods neglect the attribute information of social networks. In this paper, we propose a novel semi-supervised network-embedding model to address the problem. In the model, each node of the multiple networks is represented by a vector for anchor link prediction, which is learnt with awareness of observed anchor links as semi-supervised information, and topology structure and attributes as input. Experimental results on the real-world data sets demonstrate the superiority of the proposed model compared to state-of-the-art techniques.

2.
BMC Bioinformatics ; 15 Suppl 2: S9, 2014.
Article in English | MEDLINE | ID: mdl-24564855

ABSTRACT

BACKGROUND: Automated assignment of functions to unknown proteins is one of the most important tasks in computational biology. The development of experimental methods for genome-scale analysis of molecular interaction networks offers new ways to infer protein function from protein-protein interaction (PPI) network data. Existing techniques for collective classification (CC) usually increase accuracy for network data, wherein instances are interlinked with each other, using a large amount of labeled data for training. However, labeled data are time-consuming and expensive to obtain. On the other hand, one can easily obtain large amounts of unlabeled data. Thus, more sophisticated methods are needed to exploit the unlabeled data to increase accuracy in protein function prediction. RESULTS: In this paper, we propose an effective Markov chain based CC algorithm (ICAM) to tackle the label deficiency problem in CC for interrelated proteins from PPI networks. Our idea is to model the problem using two distinct Markov chain classifiers to make separate predictions with regard to attribute features from protein data and relational features from relational information. The ICAM learning algorithm combines the results of the two classifiers to compute the ranks of labels to indicate the importance of a set of labels to an instance, and uses an ICA framework to iteratively refine the learning models for improving performance of protein function prediction from PPI networks in the paucity of labeled data. CONCLUSION: Experimental results on the real-world Yeast protein-protein interaction datasets show that our proposed ICAM method is better than the other ICA-type methods given limited labeled training data. This approach can serve as a valuable tool for the study of protein function prediction from PPI networks.


Subjects
Algorithms, Protein Interaction Mapping/methods, Proteins/physiology, Markov Chains
3.
BMC Genomics ; 15 Suppl 9: S17, 2014.
Article in English | MEDLINE | ID: mdl-25521242

ABSTRACT

BACKGROUND: With the rapid accumulation of proteomic and genomic datasets in terms of genome-scale features and interaction networks through high-throughput experimental techniques, the process of manually predicting functional properties of proteins has become increasingly cumbersome, and computational methods to automate this annotation task are urgently needed. Most approaches to predicting functional properties of proteins require either identifying a reliable set of labeled proteins with similar attribute features to unannotated proteins, or learning from a fully-labeled protein interaction network with a large amount of labeled data. However, acquiring such labels can be very difficult in practice, especially for multi-label protein function prediction problems. Learning with only a few labeled data can lead to poor performance as limited supervision knowledge can be obtained from similar proteins or from connections between them. To effectively annotate proteins even in the paucity of labeled data, it is important to take advantage of all data sources that are available in this problem setting, including interaction networks, attribute feature information, correlations of functional labels, and unlabeled data. RESULTS: In this paper, we show that the underlying nature of predicting functional properties of proteins using various data sources of relational data is a typical collective classification (CC) problem in machine learning. The protein functional prediction task with limited annotation is then cast into a semi-supervised multi-label collective classification (SMCC) framework. As such, we propose a novel generative model based SMCC algorithm, called GM-SMCC, to effectively compute the label probability distributions of unannotated protein instances and predict their functional properties. To further boost the predictive performance, we extend the method in an ensemble manner, called EGM-SMCC, by utilizing multiple heterogeneous networks with various latent linkages constructed to explicitly model the relationships among the nodes in order to effectively propagate the supervision knowledge from labeled to unlabeled nodes. CONCLUSION: Experimental results on a yeast gene dataset predicting the functions and localization of proteins demonstrate the effectiveness of the proposed method. In the comparison, we find that the proposed algorithms perform better than the other compared algorithms.


Subjects
Artificial Intelligence, Genomics/methods, Algorithms, Molecular Sequence Annotation, Probability, Protein Interaction Mapping, Yeasts/genetics, Yeasts/metabolism
4.
ScientificWorldJournal ; 2014: 497354, 2014.
Article in English | MEDLINE | ID: mdl-24982961

ABSTRACT

Textual stream classification has become a realistic and challenging issue since large-scale, high-dimensional, and non-stationary streams with class imbalance have been widely used in various real-life applications. Given the characteristics of textual streams, it is technically difficult to classify them, especially in an imbalanced environment. In this paper, we propose a new ensemble framework, clustering forest, for learning from the textual imbalanced stream with concept drift (CFIM). The CFIM is based on ensemble learning by integrating a set of clustering trees (CTs). An adaptive selection method, which flexibly chooses the useful CTs by the property of the stream, is presented in CFIM. In particular, to deal with the problem of class imbalance, we collect and reuse both rare-class instances and misclassified instances from the historical chunks. Compared to most existing approaches, it is worth pointing out that our approach assumes that both the majority class and the rare class may suffer from concept drift. Thus the distribution of resampled instances is similar to the current concept. The effectiveness of CFIM is examined in five real-world textual streams under an imbalanced non-stationary environment. Experimental results demonstrate that CFIM achieves better performance than four state-of-the-art ensemble models.


Subjects
Algorithms, Artificial Intelligence
5.
Neural Netw ; 170: 535-547, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38043373

ABSTRACT

Anomaly detection in multivariate time series is of critical importance in many real-world applications, such as system maintenance and Internet monitoring. In this article, we propose a novel unsupervised framework called SVD-AE to conduct anomaly detection in multivariate time series. The core idea is to fuse the strengths of both SVD and autoencoder to fully capture complex normal patterns in multivariate time series. An asymmetric autoencoder architecture is proposed, where two encoders are used to capture features in time and variable dimensions and a shared decoder is used to generate reconstructions based on latent representations from both dimensions. A new regularization based on singular value decomposition theory is designed to force each encoder to learn features in the corresponding axis, with mathematical support provided. A specific loss component is further proposed to align Fourier coefficients of inputs and reconstructions. It can preserve details of original inputs, leading to enhanced feature learning capability of the model. Extensive experiments on three real-world datasets demonstrate that the proposed algorithm can achieve better performance on multivariate time series anomaly detection tasks under highly unbalanced scenarios compared with baseline algorithms.


Subjects
Algorithms, Internet, Time Factors, Learning
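The SVD-AE abstract above mentions a loss component that aligns the Fourier coefficients of inputs and reconstructions, but it does not give the formula. The sketch below is one plausible reading, assuming a plain squared distance between discrete Fourier transform coefficients of a single univariate series. The names `dft` and `fourier_alignment_loss` are hypothetical and are not taken from the paper.

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def fourier_alignment_loss(inputs, reconstruction):
    """Mean squared distance between the DFT coefficients of two series.

    Assumption: the paper's actual loss may weight or select frequencies
    differently; this is only the simplest unweighted variant.
    """
    if len(inputs) != len(reconstruction) or not inputs:
        raise ValueError("series must be non-empty and of equal length")
    fx, fr = dft(inputs), dft(reconstruction)
    return sum(abs(a - b) ** 2 for a, b in zip(fx, fr)) / len(fx)


if __name__ == "__main__":
    original = [0.0, 1.0, 0.0, -1.0]
    smoothed = [0.0, 0.5, 0.0, -0.5]
    print(fourier_alignment_loss(original, smoothed))
```

A perfect reconstruction gives a loss of zero. In this unweighted form the loss equals the time-domain squared error (by Parseval's theorem), so a practical variant would presumably emphasize particular frequencies.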
6.
Neural Netw ; 174: 106233, 2024 Jun.
Article in English | MEDLINE | ID: mdl-38508045

ABSTRACT

Regional wind speed prediction is an important spatiotemporal prediction problem which is crucial for optimizing wind power utilization. Nevertheless, the complex dynamics of wind speed pose a formidable challenge to prediction tasks. The evolving dynamics of wind could be governed by underlying physical principles that can be described by partial differential equations (PDE). This study proposes a novel approach called PDE-assisted network (PaNet) for regional wind speed prediction. In PaNet, a new architecture is devised, incorporating both PDE-based dynamics (PDE dynamics) and unknown dynamics. Specifically, this architecture establishes interactions between the two dynamics, regulated by an inter-dynamics communication unit that controls interactions through attention gates. Additionally, recognizing the significance of the initial state for PDE dynamics, an adaptive frequency-gated unit is introduced to generate a suitable initial state for the PDE dynamics by selecting essential frequency components. To evaluate the predictive performance of PaNet, this study conducts comprehensive experiments on two real-world wind speed datasets. The experimental results indicated that the proposed method is superior to other baseline methods.


Subjects
Neural Networks (Computer), Wind
7.
BMC Genomics ; 14 Suppl 4: S2, 2013.
Article in English | MEDLINE | ID: mdl-24268038

ABSTRACT

BACKGROUND: Identifying modules from time series biological data helps us understand biological functionalities of a group of proteins/genes interacting together and how responses of these proteins/genes dynamically change with respect to time. With rapid acquisition of time series biological data from different laboratories or databases, new challenges are posed for the identification task and powerful methods which are able to detect modules with integrative analysis are urgently called for. To accomplish such integrative analysis, we assemble multiple time series biological data into a higher-order form, e.g., a gene × condition × time tensor. It is interesting and useful to develop methods to identify modules from this tensor. RESULTS: In this paper, we present MultiFacTV, a new method to find modules from higher-order time series biological data. This method employs a tensor factorization objective function where a time-related total variation regularization term is incorporated. According to factorization results, MultiFacTV extracts modules that are composed of some genes, conditions and time-points. We have performed MultiFacTV on synthetic datasets and the results have shown that MultiFacTV outperforms existing methods EDISA and Metafac. Moreover, we have applied MultiFacTV to the Arabidopsis thaliana root (shoot) tissue dataset represented as a gene × condition × time tensor of size 2395 × 9 × 6 (3454 × 8 × 6), and to the Yeast and Homo sapiens datasets represented as tensors of sizes 4425 × 6 × 6 and 2920 × 14 × 9 respectively. The results have shown that MultiFacTV indeed identifies some interesting modules in these datasets, which have been validated and explained by Gene Ontology analysis with DAVID or other analysis. CONCLUSION: Experimental results on both synthetic datasets and real datasets show that the proposed MultiFacTV is effective in identifying modules for higher-order time series biological data. It provides, compared to traditional non-integrative analysis methods, a more comprehensive and better view on biological process since modules composed of more than two types of biological variables could be identified and analyzed.


Subjects
Algorithms, Computational Biology/methods, Fungal Genes, Plant Genes, Human Genome, Arabidopsis/genetics, Genetic Databases, Gene Expression Profiling, Gene Ontology, Humans, Time Factors
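The MultiFacTV abstract above refers to a time-related total variation regularization term without stating it. As a rough illustration of what a total variation penalty along the time axis measures, the sketch below sums absolute differences between consecutive time points of a gene × condition × time tensor stored as nested lists. Applying it to the raw tensor rather than to a factor matrix is an assumption made for simplicity, and `temporal_total_variation` is a hypothetical name.

```python
def temporal_total_variation(tensor):
    """Sum of |x[t+1] - x[t]| over every (gene, condition) time series.

    `tensor` is indexed as tensor[gene][condition][time]. This is the generic
    one-dimensional total variation, not the exact term used in the paper.
    """
    total = 0.0
    for gene_slice in tensor:
        for series in gene_slice:
            total += sum(abs(b - a) for a, b in zip(series, series[1:]))
    return total


if __name__ == "__main__":
    # Two genes, one condition, three time points each.
    smooth = [[[1.0, 1.0, 1.0]], [[2.0, 2.0, 2.0]]]
    jumpy = [[[0.0, 1.0, 0.0]], [[2.0, 2.0, 5.0]]]
    print(temporal_total_variation(smooth), temporal_total_variation(jumpy))
```

A penalty of this kind is small for profiles that change slowly over time and large for profiles that jump between adjacent time points, which is why it favors temporally coherent modules.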
8.
IEEE Trans Neural Netw Learn Syst ; 34(4): 2079-2092, 2023 Apr.
Article in English | MEDLINE | ID: mdl-34487497

ABSTRACT

Heterogeneous domain adaptation (HDA) tackles the learning of cross-domain samples with both different probability distributions and feature representations. Most of the existing HDA studies focus on the single-source scenario. In reality, however, it is not uncommon to obtain samples from multiple heterogeneous domains. In this article, we study the multisource HDA problem and propose a conditional weighting adversarial network (CWAN) to address it. The proposed CWAN adversarially learns a feature transformer, a label classifier, and a domain discriminator. To quantify the importance of different source domains, CWAN introduces a sophisticated conditional weighting scheme to calculate the weights of the source domains according to the conditional distribution divergence between the source and target domains. Different from existing weighting schemes, the proposed conditional weighting scheme not only weights the source domains but also implicitly aligns the conditional distributions during the optimization process. Experimental results clearly demonstrate that the proposed CWAN performs much better than several state-of-the-art methods on four real-world datasets.

9.
Neural Netw ; 167: 533-550, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37696071

ABSTRACT

In wind speed prediction technologies, deep learning-based methods have achieved promising advantages. However, most existing methods focus on learning implicit knowledge in a data-driven manner but neglect some explicit knowledge from the physical theory of meteorological dynamics, failing to make stable and long-term predictions. In this paper, we explore introducing explicit physical knowledge into neural networks and propose Physical Equations Predictive Network (PEPNet) for multi-step wind speed predictions. In PEPNet, a new neural block called the Augmented Neural Barotropic Equations (ANBE) block is designed as its key component, which aims to capture the wind dynamics by combining barotropic primitive equations and deep neural networks. Specifically, the ANBE block adopts a two-branch structure to model wind dynamics, where one branch is physics-based and the other is data-driven. The physics-based branch constructs temporal partial derivatives of meteorological elements (including u-component wind, v-component wind, and geopotential height) in a new Neural Barotropic Equations Unit (NBEU). The NBEU is developed based on the barotropic primitive equations mode in numerical weather prediction (NWP). Besides, considering that the barotropic primitive mode is a crude assumption of atmospheric motion, another data-driven branch is developed in the ANBE block, which aims at capturing meteorological dynamics beyond barotropic primitive equations. Finally, the PEPNet follows a time-variant structure to enhance the model's capability to capture wind dynamics over time. To evaluate the predictive performance of PEPNet, we have conducted several experiments on two real-world datasets. Experimental results show that the proposed method outperforms the state-of-the-art techniques and achieves optimal performance.


Subjects
Neural Networks (Computer), Wind, Motion
10.
IEEE Trans Pattern Anal Mach Intell ; 45(10): 12250-12268, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37216260

ABSTRACT

Few-shot learning (FSL) aims to recognize novel classes with few examples. Pre-training based methods effectively tackle the problem by pre-training a feature extractor and then fine-tuning it through the nearest centroid based meta-learning. However, results show that the fine-tuning step makes marginal improvements. In this paper, 1) we figure out the reason, i.e., in the pre-trained feature space, the base classes already form compact clusters while novel classes spread as groups with large variances, which implies that fine-tuning feature extractor is less meaningful; 2) instead of fine-tuning feature extractor, we focus on estimating more representative prototypes. Consequently, we propose a novel prototype completion based meta-learning framework. This framework first introduces primitive knowledge (i.e., class-level part or attribute annotations) and extracts representative features for seen attributes as priors. Second, a part/attribute transfer network is designed to learn to infer the representative features for unseen attributes as supplementary priors. Finally, a prototype completion network is devised to learn to complete prototypes with these priors. Moreover, to avoid the prototype completion error, we further develop a Gaussian based prototype fusion strategy that fuses the mean-based and completed prototypes by exploiting the unlabeled samples. At last, we also develop an economic prototype completion version for FSL, which does not need to collect primitive knowledge, for a fair comparison with existing FSL methods without external knowledge. Extensive experiments show that our method: i) obtains more accurate prototypes; ii) achieves superior performance on both inductive and transductive FSL settings.
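The abstract above builds on mean-based prototypes and nearest-centroid classification, which are standard in few-shot learning. The sketch below shows only that baseline, assuming features are plain lists of floats and distance is Euclidean. It does not implement the prototype completion network or the Gaussian-based fusion, and the function names are hypothetical.

```python
import math


def mean_prototype(vectors):
    """Element-wise mean of a non-empty list of equal-length feature vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def nearest_prototype(query, prototypes):
    """Return the label whose prototype is closest to `query` (Euclidean)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))


if __name__ == "__main__":
    # A toy 2-way 2-shot episode with made-up 2-D features.
    support = {
        "cat": [[0.0, 0.0], [0.2, 0.0]],
        "dog": [[1.0, 1.0], [1.0, 0.8]],
    }
    prototypes = {label: mean_prototype(vs) for label, vs in support.items()}
    print(nearest_prototype([0.9, 0.7], prototypes))
```

With only a few support samples per class, the mean can sit far from the true class center. That gap is what the abstract's completed prototypes are meant to reduce.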

11.
Neural Netw ; 161: 25-38, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36735998

ABSTRACT

Traffic flow prediction (TFP) has attracted increasing attention with the development of smart cities. In the past few years, neural network-based methods have shown impressive performance for TFP. However, most previous studies fail to explicitly and effectively model the relationship between inflows and outflows. Consequently, these methods are usually uninterpretable and inaccurate. In this paper, we propose an interpretable local flow attention (LFA) mechanism for TFP, which yields three advantages. (1) LFA is flow-aware. Different from existing works, which blend inflows and outflows in the channel dimension, we explicitly exploit the correlations between flows with a novel attention mechanism. (2) LFA is interpretable. It is formulated by the truisms of traffic flow, and the learned attention weights can well explain the flow correlations. (3) LFA is efficient. Instead of using global spatial attention as in previous studies, LFA leverages the local mode. The attention query is only performed on the local related regions. This not only reduces computational cost but also avoids false attention. Based on LFA, we further develop a novel spatiotemporal cell, named LFA-ConvLSTM (LFA-based convolutional long short-term memory), to capture the complex dynamics in traffic data. Specifically, LFA-ConvLSTM consists of three parts. (1) A ConvLSTM module is utilized to learn flow-specific features. (2) An LFA module accounts for modeling the correlations between flows. (3) A feature aggregation module fuses the above two to obtain a comprehensive feature. Extensive experiments on two real-world datasets show that our method achieves better prediction performance. We improve the RMSE metric by 3.2%-4.6%, and the MAPE metric by 6.2%-6.7%. Our LFA-ConvLSTM is also almost 32% faster than global self-attention ConvLSTM in terms of prediction time. Furthermore, we also present some visual results to analyze the learned flow correlations.


Subjects
Learning, Long-Term Memory, Neural Networks (Computer)
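The traffic flow abstract above reports gains in RMSE and MAPE. For readers unfamiliar with these metrics, the sketch below shows their usual definitions and how a relative improvement is computed. Skipping zero-valued targets in MAPE is an assumption (conventions vary), and the sample numbers are invented for illustration, not taken from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error; zero targets are skipped (an assumption)."""
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("need at least one non-zero target")
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)


def relative_improvement(baseline_error, new_error):
    """Percentage by which `new_error` is lower than `baseline_error`."""
    return 100.0 * (baseline_error - new_error) / baseline_error


if __name__ == "__main__":
    observed = [100.0, 200.0, 150.0]   # made-up flow counts
    forecast = [110.0, 180.0, 150.0]
    print(rmse(observed, forecast), mape(observed, forecast))
```

A statement such as "improves RMSE by 3.2%" normally refers to `relative_improvement` applied to the baseline's and the new method's RMSE on the same test set.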
12.
Neural Netw ; 168: 256-271, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37774512

ABSTRACT

As a pixel-wise dense forecast task, video prediction is challenging due to its high computation complexity, dramatic future uncertainty, and extremely complicated spatial-temporal patterns. Many deep learning methods are proposed for the task, which bring up significant improvements. However, they focus on modeling short-term spatial-temporal dynamics and fail to sufficiently exploit long-term ones. As a result, the methods tend to deliver unsatisfactory performance for a long-term forecast requirement. In this article, we propose a novel unified memory network (UNIMEMnet) for long-term video prediction, which can effectively exploit long-term motion-appearance dynamics and unify the short-term spatial-temporal dynamics and long-term ones in an architecture. In the UNIMEMnet, a dual branch multi-scale memory module is carefully designed to extract and preserve long-term spatial-temporal patterns. In addition, a short-term spatial-temporal dynamics module and an alignment and fusion module are devised to capture and coordinate short-term motion-appearance dynamics with long-term ones from our designed memory module. Extensive experiments on five video prediction datasets from both synthetic and real-world scenarios are conducted, which validate the effectiveness and superiority of our proposed method UNIMEMnet over state-of-the-art methods.


Subjects
Motion, Uncertainty
13.
Neural Netw ; 162: 147-161, 2023 May.
Article in English | MEDLINE | ID: mdl-36907005

ABSTRACT

Regional wind speed prediction plays an important role in the development of wind power. Wind speed is usually recorded in the form of two orthogonal components, namely U-wind and V-wind. The regional wind speed has the characteristics of diverse variations, which are reflected in three aspects: (1) The spatially diverse variations of regional wind speed indicate that wind speed has different dynamic patterns at different positions; (2) The distinct variations between U-wind and V-wind denote that U-wind and V-wind at the same position exhibit different dynamic patterns; (3) The non-stationary variations of wind speed reflect the intermittent and chaotic nature of wind speed. In this paper, we propose a novel framework named Wind Dynamics Modeling Network (WDMNet) to model the diverse variations of regional wind speed and make accurate multi-step predictions. To jointly capture the spatially diverse variations and the distinct variations between U-wind and V-wind, WDMNet leverages a new neural block called Involution Gated Recurrent Unit Partial Differential Equation (Inv-GRU-PDE) as its key component. The block adopts involution to model spatially diverse variations and separately constructs hidden driven PDEs of U-wind and V-wind. The construction of PDEs in this block is achieved by new Involution PDE (InvPDE) layers. Besides, a deep data-driven model is also introduced in the Inv-GRU-PDE block as the complement to the constructed hidden PDEs for sufficiently modeling regional wind dynamics. Finally, to effectively capture the non-stationary variations of wind speed, WDMNet follows a time-variant structure for multi-step predictions. Comprehensive experiments have been conducted on two real-world datasets. Experimental results demonstrate the effectiveness and superiority of the proposed method over state-of-the-art techniques.


Subjects
Wind
14.
IEEE/ACM Trans Comput Biol Bioinform ; 20(6): 3863-3875, 2023.
Article in English | MEDLINE | ID: mdl-37878431

ABSTRACT

Few-Shot Molecular Property Prediction (FSMPP) is an important task in drug discovery, which aims to learn transferable knowledge from base property prediction tasks with sufficient data for predicting novel properties with few labeled molecules. Its key challenge is how to alleviate the data scarcity issue of novel properties. Pretrained Graph Neural Network (GNN) based FSMPP methods effectively address the challenge by pre-training a GNN from large-scale self-supervised tasks and then finetuning it on base property prediction tasks to perform novel property prediction. However, in this paper, we find that the GNN finetuning step is not always effective, and that it even degrades the performance of the pretrained GNN on some novel properties. This is because the molecule-property relationships among molecules change across different properties, which causes the finetuned GNN to overfit to base properties and harms the transferability of the pretrained GNN to novel properties. To address this issue, in this paper, we propose a novel Adaptive Transfer framework of GNN for FSMPP, called ATGNN, which transfers the knowledge of pretrained and finetuned GNNs in a task-adaptive manner to adapt to novel properties. Specifically, we first regard the pretrained and finetuned GNNs as model priors of the target-property GNN. Then, a task-adaptive weight prediction network is designed to leverage these priors to predict target GNN weights for novel properties. Finally, we combine our ATGNN framework with existing FSMPP methods for FSMPP. Extensive experiments on four real-world datasets, i.e., Tox21, SIDER, MUV, and ToxCast, show the effectiveness of our ATGNN framework.


Subjects
Drug Discovery, Neural Networks (Computer)
15.
Neural Netw ; 152: 118-139, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35523084

ABSTRACT

Wind power is a new type of green energy. Though it is economical to access and gather such energy, effectively matching the energy with consumers' demand is difficult, because of the fluctuating, intermittent and chaotic nature of wind speed. Hence, multi-step wind speed prediction becomes an important research topic. In this paper, we propose a novel deep learning method, DynamicNet, for the problem. DynamicNet follows an encoder-decoder framework. To capture the fluctuating, intermittent and chaotic nature of wind speed, it leverages a time-variant structure to build the decoder, which is different from conventional encoder-decoder methods. In addition, a new neural block (ST-GRU-ODE) is developed, which can model the wind speed in a continuous manner by using the neural ordinary differential equation (ODE). To enhance the prediction performance, a multi-step training procedure is also put forward. Comprehensive experiments have been conducted on two real-world datasets, where wind speed is recorded in the form of two orthogonal components, namely U-Wind and V-Wind. Each component can be illustrated as wind speed images. Experimental results demonstrate the effectiveness and superiority of the proposed method over state-of-the-art techniques.


Subjects
Wind
16.
Neural Netw ; 155: 242-257, 2022 Nov.
Article in English | MEDLINE | ID: mdl-36081197

ABSTRACT

The near-surface temperature prediction (NTP) is an important spatial-temporal forecast problem, which can be used to prevent temperature crises. Most of the previous approaches fail to explicitly model the long- and short-range spatial correlations simultaneously, which is critical to making an accurate temperature prediction. In this study, both long- and short-range spatial correlations are captured to fill this gap by a novel convolution operator named Long- and Short-range Convolution (LS-Conv). The proposed LS-Conv operator includes three key components, namely, Node-based Spatial Attention (NSA), Long-range Adaptive Graph Constructor (LAGC), and Long- and Short-range Integrator (LSI). To capture long-range spatial correlations, NSA and LAGC are proposed to evaluate node importance with the aim of auto-constructing long-range spatial correlations; together they form the Long-range aware Graph Convolution Network (LR-GCN). After that, the Short-range aware Convolution Neural Network (SR-CNN) accounts for the short-range spatial correlations. Finally, LSI is proposed to capture both long- and short-range spatial correlations by intra-unifying LR-GCN and SR-CNN. Upon the proposed LS-Conv operator, a new model called Long- and Short-range for NTP (LS-NTP) is developed. Extensive experiments are conducted on two real-world datasets and the results demonstrate that the proposed method outperforms state-of-the-art techniques. The source code is available on GitHub: https://github.com/xuguangning1218/LS_NTP.


Subjects
Neural Networks (Computer), Software, Temperature, Attention
17.
IEEE Trans Cybern ; 49(9): 3230-3241, 2019 Sep.
Article in English | MEDLINE | ID: mdl-29994344

ABSTRACT

Finding customer groups from transaction data is very important for retail and e-commerce companies. Recently, a "Purchase Tree" data structure is proposed to compress the customer transaction data and a local PurTree spectral clustering method is proposed to cluster the customer transaction data. However, in the PurTree distance, the node weights for the children nodes of a parent node are set as equal and the differences between different nodes are not distinguished. In this paper, we propose a two-level subspace weighting spectral clustering (TSW) algorithm for customer transaction data. In the new method, a PurTree subspace metric is proposed to measure the dissimilarity between two customers represented by two purchase trees, in which a set of level weights are introduced to distinguish the importance of different tree levels and a set of sparse node weights are introduced to distinguish the importance of different tree nodes in a purchase tree. TSW learns an adaptive similarity matrix from the local distances in order to better uncover the cluster structure buried in the customer transaction data. Simultaneously, it learns a set of level weights and a set of sparse node weights in the PurTree subspace distance. An iterative optimization algorithm is proposed to optimize the proposed model. We also present an efficient method to compute a regularization parameter in TSW. TSW was compared with six clustering algorithms on ten benchmark data sets and the experimental results show the superiority of the new method.

18.
Nat Commun ; 10(1): 470, 2019 Jan 28.
Article in English | MEDLINE | ID: mdl-30692544

ABSTRACT

Integrative analysis of multi-omics layers at single cell level is critical for accurate dissection of cell-to-cell variation within certain cell populations. Here we report scCAT-seq, a technique for simultaneously assaying chromatin accessibility and the transcriptome within the same single cell. We show that the combined single cell signatures enable accurate construction of regulatory relationships between cis-regulatory elements and the target genes at single-cell resolution, providing a new dimension of features that helps direct discovery of regulatory patterns specific to distinct cell identities. Moreover, we generate the first single cell integrated map of chromatin accessibility and transcriptome in early embryos and demonstrate the robustness of scCAT-seq in the precise dissection of master transcription factors in cells of distinct states. The ability to obtain these two layers of omics data will help provide more accurate definitions of "single cell state" and enable the deconvolution of regulatory heterogeneity from complex cell populations.


Subjects
Chromatin/genetics, Epigenomics, Gene Expression Regulation, Single-Cell Analysis/methods, Transcriptome, Chromatin/metabolism, Mammalian Embryo/cytology, Mammalian Embryo/metabolism, HCT116 Cells, HeLa Cells, Humans, K562 Cells, Nucleic Acid Regulatory Sequences/genetics, DNA Sequence Analysis/methods
19.
IEEE Trans Neural Netw Learn Syst ; 28(8): 1787-1800, 2017 Aug.
Article in English | MEDLINE | ID: mdl-28727548

ABSTRACT

With the advancement of data acquisition techniques, tensor (multidimensional data) objects are increasingly accumulated and generated, for example, multichannel electroencephalographies, multiview images, and videos. In these applications, the tensor objects are usually nonnegative, since the physical signals are recorded. As the dimensionality of tensor objects is often very high, a dimension reduction technique becomes an important research topic of tensor data. From the perspective of geometry, high-dimensional objects often reside in a low-dimensional submanifold of the ambient space. In this paper, we propose a new approach to perform the dimension reduction for nonnegative tensor objects. Our idea is to use nonnegative Tucker decomposition (NTD) to obtain a set of core tensors of smaller sizes by finding a common set of projection matrices for tensor objects. To preserve geometric information in tensor data, we employ a manifold regularization term for the core tensors constructed in the Tucker decomposition. An algorithm called manifold regularization NTD (MR-NTD) is developed to solve the common projection matrices and core tensors in an alternating least squares manner. The convergence of the proposed algorithm is shown, and the computational complexity of the proposed method scales linearly with respect to the number of tensor objects and the size of the tensor objects, respectively. These theoretical results show that the proposed algorithm can be efficient. Extensive experimental results have been provided to further demonstrate the effectiveness and efficiency of the proposed MR-NTD algorithm.

20.
IEEE Trans Image Process ; 25(3): 1396-409, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26849860

ABSTRACT

In this paper, we propose and develop a multi-visual-concept ranking (MultiVCRank) scheme for image retrieval. The key idea is that an image can be represented by several visual concepts, and a hypergraph is built based on visual concepts as hyperedges, where each edge contains images as vertices to share a specific visual concept. In the constructed hypergraph, the weight between two vertices in a hyperedge is incorporated, and it can be measured by their affinity in the corresponding visual concept. A ranking scheme is designed to compute the association scores of images and the relevance scores of visual concepts by employing input query vectors to handle image retrieval. In the scheme, the association and relevance scores are determined by an iterative method to solve limiting probabilities of a multi-dimensional Markov chain arising from the constructed hypergraph. The convergence analysis of the iteration method is studied and analyzed. Moreover, a learning algorithm is also proposed to set the parameters in the scheme, which makes it simple to use. Experimental results on the MSRC, Corel, and Caltech256 data sets have demonstrated the effectiveness of the proposed method. In the comparison, we find that the retrieval performance of MultiVCRank is substantially better than those of HypergraphRank, ManifoldRank, TOPHITS, and RankSVM.
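The MultiVCRank abstract above determines its scores by iterating toward the limiting probabilities of a multi-dimensional Markov chain. The sketch below illustrates the underlying idea on a much simpler object: power iteration on an ordinary (single) row-stochastic transition matrix. The matrix values and the name `stationary_distribution` are illustrative assumptions, not the paper's formulation.

```python
def stationary_distribution(transition, tol=1e-12, max_iter=10_000):
    """Approximate pi with pi = pi * P by repeated multiplication.

    `transition` is a row-stochastic matrix given as a list of rows.
    Convergence is assumed (for example, an irreducible aperiodic chain).
    """
    n = len(transition)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        nxt = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    # Two-state toy chain; the exact limiting distribution is (5/6, 1/6).
    P = [[0.9, 0.1],
         [0.5, 0.5]]
    print(stationary_distribution(P))
```

In a ranking setting, the entries of the limiting vector serve as scores: states the chain visits more often in the long run rank higher.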
