Results 1 - 20 of 21
1.
IEEE Trans Pattern Anal Mach Intell ; 46(5): 3169-3182, 2024 May.
Article in English | MEDLINE | ID: mdl-38039175

ABSTRACT

Various correlations hidden in crowdsourcing annotation tasks bring opportunities to further improve the accuracy of label aggregation. However, these relationships are usually extremely difficult to model, and most existing methods can make use of only one or two of them. In this paper, we propose a novel graph neural network model, namely LAGNN, which models five different correlations in crowdsourced annotation tasks by utilizing deep graph neural networks with convolution operations, achieving high label aggregation performance. By identifying groups of high-quality workers through labeling similarity, LAGNN can efficiently revise the preferences among workers. Moreover, by injecting a small amount of ground truth into its training stage, the label aggregation performance of LAGNN can be further significantly improved. We evaluate LAGNN on a large number of simulated datasets generated by varying six degrees of freedom and on eight real-world crowdsourcing datasets, in both supervised and unsupervised (agnostic) modes. Experiments on data leakage are also included. Experimental results consistently show that the proposed LAGNN significantly outperforms six state-of-the-art models in terms of label aggregation accuracy.
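
The label aggregation problem this abstract addresses can be illustrated with a minimal majority-voting baseline. This is not LAGNN itself (which uses deep graph neural networks); it is the simplest aggregator that methods like LAGNN are measured against, with illustrative data:

```python
from collections import Counter

def majority_vote(annotations):
    """Aggregate crowdsourced labels by simple majority voting.

    annotations: dict mapping instance id -> list of worker labels.
    Returns a dict mapping instance id -> aggregated label.
    """
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in annotations.items()}

# Three workers label two instances; one worker disagrees on "q1".
votes = {"q1": ["cat", "cat", "dog"], "q2": ["dog", "dog", "dog"]}
print(majority_vote(votes))  # {'q1': 'cat', 'q2': 'dog'}
```

Majority voting ignores worker quality and label correlations entirely, which is exactly the information LAGNN's five modeled correlations try to exploit.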

2.
Article in English | MEDLINE | ID: mdl-37018300

ABSTRACT

In biochemistry, graph structures have been widely used for modeling compounds, proteins, functional interactions, etc. Graph classification, the common task of dividing these graphs into different categories, relies heavily on the quality of graph representations. With the advance of graph neural networks, message-passing-based methods have been adopted to iteratively aggregate neighborhood information for better graph representations. These methods, though powerful, still suffer from some shortcomings. The first challenge is that pooling-based methods in graph neural networks may ignore the part-whole hierarchies naturally present in graph structures. These part-whole relationships are usually valuable for many molecular function prediction tasks. The second challenge is that most existing methods do not take the heterogeneity embedded in graph representations into consideration. Disentangling this heterogeneity can increase the performance and interpretability of models. This paper proposes a graph capsule network for graph classification tasks with disentangled feature representations learned automatically by well-designed algorithms. This method is capable of, on the one hand, decomposing heterogeneous representations into more fine-grained elements, while on the other hand, capturing part-whole relationships using capsules. Extensive experiments performed on several publicly available biochemistry datasets demonstrate the effectiveness of the proposed method compared with nine state-of-the-art graph learning methods.

3.
Article in English | MEDLINE | ID: mdl-35917576

ABSTRACT

Multilabel annotation is a critical step in generating training sets for learning classification models in various application domains, but asking domain experts to provide labels is usually time-consuming and expensive, which cannot keep pace with the fast evolution of models in the big data era. Although crowdsourcing provides a fast solution to acquiring labels for multilabel learning, it faces the risks of high data acquisition cost and low label quality. This article proposes a novel one-coin label-dependent active crowdsourcing (OCLDAC) method to iteratively query noisy labels from crowd workers and learn multilabel classification models. In each iteration of active learning, integrated labels of instances are first inferred by a novel one-coin label-dependent model, which utilizes a mixture of multiple independent Bernoulli distributions to explore and exploit correlations among the labels and thereby increase the accuracy of truth inference. Then, instances, labels, and workers are selected according to novel strategies that incorporate the distribution of noisy labels, the prediction probability of the learning models, label correlations, and the reliability of crowd workers. Simulations on eight multilabel datasets and evaluation on one real-world crowdsourcing dataset consistently show that the proposed OCLDAC significantly outperforms the state-of-the-art methods and their variants.

4.
J Pers Med ; 11(10)2021 Oct 19.
Article in English | MEDLINE | ID: mdl-34683185

ABSTRACT

The liver is an irreplaceable organ in the human body, maintaining life activities and metabolism. Malignant tumors of the liver currently have a high mortality rate. Computer-aided segmentation of the liver and its tumors has significant benefits for clinical diagnosis and treatment. Many challenges remain in segmenting the liver and liver tumors simultaneously: on the one hand, convolutional kernels with fixed geometric structures do not match complex, irregularly shaped targets; on the other, pooling during convolution results in a loss of spatial contextual information. In this work, we designed a cascaded U-ADenseNet with coarse-to-fine processing to address these issues in fully automatic segmentation. This work contributes multi-resolution input images and multi-layered channel attention combined with densely connected atrous spatial pyramid pooling in the fine segmentation stage. The proposed model was evaluated on the public dataset of the Liver Tumor Segmentation Challenge (LiTS). Our approach attained competitive liver and tumor segmentation scores that exceeded other methods across a wide range of metrics.

5.
ACM Comput Surv ; 53(2)2020 Jun.
Article in English | MEDLINE | ID: mdl-34421185

ABSTRACT

Image classification is a key task in image understanding, and multi-label image classification has become a popular topic in recent years. However, the success of multi-label image classification is closely tied to how the training set is constructed. As active learning aims to construct an effective training set by iteratively selecting the most informative examples to query labels from annotators, it has been introduced into multi-label image classification. Accordingly, multi-label active learning is becoming an important research direction. In this work, we first review existing multi-label active learning algorithms for image classification. These algorithms can be categorized into two top-level groups from two aspects: sampling and annotation. The most important component of multi-label active learning is designing an effective sampling strategy that actively selects the examples with the highest informativeness from an unlabeled data pool according to various information measures. Thus, different informativeness measures are emphasized in this survey. Furthermore, this work also investigates in depth the existing challenging issues and future promise of multi-label active learning, with a focus on four core aspects: example dimension, label dimension, annotation, and application extension.
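
One classic informativeness measure of the kind this survey covers is prediction entropy: query the examples the current model is least sure about. A minimal sketch (single-label for simplicity; the surveyed methods extend this to the multi-label setting; names and data are illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy of a predicted label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_most_informative(pool, k=1):
    """Uncertainty sampling: pick the k unlabeled examples whose
    predicted distributions have the highest entropy.

    pool: dict mapping example id -> list of predicted class probabilities.
    """
    return sorted(pool, key=lambda x: entropy(pool[x]), reverse=True)[:k]

pool = {"a": [0.9, 0.1], "b": [0.5, 0.5], "c": [0.7, 0.3]}
print(select_most_informative(pool, k=1))  # ['b'] - the most uncertain example
```

The multi-label variants discussed in the survey must additionally decide which label (not just which example) to query, which is why the example and label dimensions are treated separately there.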

6.
Comput Methods Programs Biomed ; 175: 73-82, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31104716

ABSTRACT

Medical image fusion is important in the field of clinical diagnosis because it can improve the availability of information contained in images. Magnetic Resonance Imaging (MRI) provides excellent anatomical detail as well as functional information on regional changes in physiology, hemodynamics, and tissue composition. In contrast, although the spatial resolution that Positron Emission Tomography (PET) provides is lower than that of MRI, PET is capable of depicting molecular and pathological activities of tissue that are not available from MRI. Fusion of MRI and PET may allow us to combine the advantages of both imaging modalities and achieve more precise localization and characterization of abnormalities. Previous image fusion algorithms based on estimation theory assume that all distortions follow a Gaussian distribution and are therefore susceptible to the model mismatch problem. To overcome this problem, we propose a new image fusion method with multi-resolution and nonparametric density models (MRNDM). The RGB space registered from the source multi-modal medical images is first transformed into a generalized intensity-hue-saturation (GIHS) space and then decomposed into low- and high-frequency components using the non-subsampled contourlet transform (NSCT). Two different fusion rules, based on the nonparametric density model and variable-weight theory, are developed and used to fuse the low- and high-frequency coefficients. The fused images are constructed by performing the inverse NSCT operation with all composite coefficients. Our experimental results demonstrate that the quality of images fused from PET and MRI brain images using our proposed MRNDM method is higher than that of images fused using six previous fusion methods.


Subject(s)
Brain/diagnostic imaging; Magnetic Resonance Imaging; Positron-Emission Tomography; Algorithms; Alzheimer Disease/diagnostic imaging; Brain Mapping; Brain Neoplasms/diagnostic imaging; Fluorodeoxyglucose F18/chemistry; Humans; Image Interpretation, Computer-Assisted/methods; Normal Distribution; Principal Component Analysis
7.
IEEE Trans Neural Netw Learn Syst ; 30(10): 3172-3185, 2019 10.
Article in English | MEDLINE | ID: mdl-30703041

ABSTRACT

With online crowdsourcing platforms, labels can be acquired at relatively low cost from massive numbers of nonexpert workers. To improve the quality of labels obtained from these imperfect crowdsourced workers, we usually let different workers provide labels for the same instance. Then, the true labels for all instances are estimated from these multiple noisy labels. This traditional general-purpose label aggregation process, relying solely on the collected noisy labels, cannot significantly improve the accuracy of integrated labels when labeling quality is low. This paper proposes a novel bilayer collaborative clustering (BLCC) method for label aggregation in crowdsourcing. BLCC first generates conceptual-level features for the instances from their multiple noisy labels and infers the initially integrated labels by clustering on these conceptual-level features. Then, it performs another clustering on the physical-level features to form estimates of the true labels on the physical layer. The clustering results on both layers facilitate tracking changes in the uncertainties of the instances. Finally, the initially integrated labels that are likely to be wrongly inferred on the conceptual layer can be remedied using the estimated labels on the physical layer. The clustering processes on the two layers keep providing guidance information for each other across multiple label remedy rounds. Experimental results on 12 real-world crowdsourcing data sets show that the proposed method outperforms the state-of-the-art methods in terms of accuracy.
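
One way to picture the "conceptual-level features" derived from noisy labels is a per-class vote-fraction vector for each instance; clustering such vectors is what the first layer operates on. This sketch shows one plausible construction of that feature (the exact construction in BLCC may differ; data is illustrative):

```python
def conceptual_features(noisy_labels, classes):
    """Map each instance's multiple noisy labels to a vector of
    per-class vote fractions - one simple style of conceptual-level
    feature built purely from the collected labels.

    noisy_labels: dict mapping instance id -> list of worker labels.
    classes: ordered list of possible class labels.
    """
    feats = {}
    for item, labels in noisy_labels.items():
        n = len(labels)
        feats[item] = [labels.count(c) / n for c in classes]
    return feats

# "x1" has a 2-1 split among workers; "x2" is unanimous.
labels = {"x1": [0, 0, 1], "x2": [1, 1, 1]}
print(conceptual_features(labels, classes=[0, 1]))
```

Instances with lopsided vectors are confidently labeled; near-uniform vectors mark the uncertain instances that the physical-layer clustering is then used to remedy.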

8.
Math Biosci Eng ; 17(2): 1041-1058, 2019 11 11.
Article in English | MEDLINE | ID: mdl-32233569

ABSTRACT

In this paper, a linguistic steganalysis method based on two-level cascaded convolutional neural networks (CNNs) is proposed to improve the system's ability to detect stego texts, which are generated via synonym substitutions. The first-level network, sentence-level CNN, consists of one convolutional layer with multiple convolutional kernels in different window sizes, one pooling layer to deal with variable sentence lengths, and one fully connected layer with dropout as well as a softmax output, such that two final steganographic features are obtained for each sentence. The unmodified and modified sentences, along with their words, are represented in the form of pre-trained dense word embeddings, which serve as the input of the network. Sentence-level CNN provides the representation of a sentence, and can thus be utilized to predict whether a sentence is unmodified or has been modified by synonym substitutions. In the second level, a text-level CNN exploits the predicted representations of sentences obtained from the sentence-level CNN to determine whether the detected text is a stego text or cover text. Experimental results indicate that the proposed sentence-level CNN can effectively extract sentence features for sentence-level steganalysis tasks and reaches an average accuracy of 82.245%. Moreover, the proposed steganalysis method achieves greatly improved detection performance when distinguishing stego texts from cover texts.

9.
Sensors (Basel) ; 18(7)2018 Jun 30.
Article in English | MEDLINE | ID: mdl-29966366

ABSTRACT

Internal reliability and external safety of Wireless Sensor Network (WSN) data transmission have become increasingly prominent issues with the wide application of WSNs. This paper proposes a new method for access control and mitigation of interfering noise in time synchronization environments. First, a formal definition is given of the impact interference noise has on the clock skew and clock offset of each node. The degree of node interference behavior is estimated dynamically from the perspective of time-stamp changes caused by the interference noise. Second, a general access control model is proposed to resist noise interference. A prediction model is constructed using the Bayesian method for calculating the reliability of neighbor node behavior in the proposed model. Interference noise, which attacks time synchronization, is regarded as the key factor in the probability estimation of reliability. The result of the calculation determines whether it is necessary to initiate synchronization filtering. Finally, a division of trust levels with a bilinear definition is employed to lower interference noise and improve the quality of interference detection. Experimental results show that this model has advantages in system overhead, energy consumption, and testing errors compared to its counterparts. When the disturbance intensity of a WSN increases, the proposed optimized algorithm converges faster with a lower network communication load.
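
The clock skew and clock offset the abstract mentions are commonly estimated by a least-squares line fit over exchanged timestamps; interference then shows up as deviations from that fit. A minimal sketch of the fit itself (illustrative only, not the paper's detection model):

```python
def estimate_skew_offset(local_times, ref_times):
    """Least-squares fit of local = skew * ref + offset over paired
    timestamps - a common way to quantify a node's clock skew and
    offset relative to a reference clock.
    """
    n = len(ref_times)
    mx = sum(ref_times) / n
    my = sum(local_times) / n
    sxx = sum((x - mx) ** 2 for x in ref_times)
    sxy = sum((x - mx) * (y - my) for x, y in zip(ref_times, local_times))
    skew = sxy / sxx          # slope: relative clock rate
    offset = my - skew * mx   # intercept: constant clock difference
    return skew, offset

# A node whose clock runs 1% fast with a constant offset of 5 units.
ref = [0, 10, 20, 30, 40]
local = [1.01 * t + 5 for t in ref]
skew, offset = estimate_skew_offset(local, ref)
print(round(skew, 4), round(offset, 4))  # 1.01 5.0
```

Timestamp perturbations injected by interference noise would pull the fitted skew and offset away from their true values, which is the effect the paper's Bayesian reliability model reasons about.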

10.
IEEE Trans Neural Netw Learn Syst ; 29(9): 4462-4472, 2018 09.
Article in English | MEDLINE | ID: mdl-29990069

ABSTRACT

Parameters in learning problems (usually arising from the tradeoff between training-error minimization and regularization) are often tuned by cross validation (CV). A solution path provides a compact representation of all optimal solutions, which can be used to determine the parameter with the global minimum CV error without solving the original optimization problems multiple times via grid search. However, existing solution path algorithms do not provide a unified implementation for various learning problems. In this paper, we first introduce a general parametric quadratic programming (PQP) problem that can be instantiated to an extensive number of learning problems. Then, we propose a generalized solution path (GSP) for the general PQP problem. In particular, we use QR decomposition to handle singularities in GSP. Finally, we analyze the finite convergence and the time complexity of GSP. Our experimental results on a variety of data sets not only confirm that GSP is identical to several existing solution path algorithms but also show the superiority of our GSP over existing solution path algorithms in both generalization and robustness. Finally, we provide a practical guide to using GSP to solve two important learning problems, i.e., the generalized error path and Ivanov SVM.

11.
IEEE Trans Neural Netw Learn Syst ; 29(5): 1675-1688, 2018 05.
Article in English | MEDLINE | ID: mdl-28333645

ABSTRACT

Crowdsourcing systems provide a cost-effective and convenient way to collect labels, but they often fail to guarantee label quality. This paper proposes a novel framework that introduces noise correction techniques to further improve the quality of integrated labels inferred from the multiple noisy labels of objects. In the proposed general framework, information about the quality of labelers, estimated by a front-end ground truth inference algorithm, is utilized to supervise subsequent label noise filtering and correction. The framework uses a novel algorithm termed adaptive voting noise correction (AVNC) to precisely identify and correct potentially noisy labels. After filtering out the instances with noisy labels, the remaining cleansed data set is used to create multiple weak classifiers, from which a powerful ensemble classifier is induced to correct these noisy labels. Experimental results on eight simulated data sets with different kinds of features and two real-world crowdsourcing data sets in different domains consistently show that: 1) the proposed framework can improve label quality regardless of the inference algorithm, especially when each instance has only a few repeated labels, and 2) since the proposed AVNC algorithm considers both the number and the probability of potential label noises, it outperforms the state-of-the-art noise correction algorithms.
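
The filtering stage of such a framework can be illustrated by flagging instances whose vote agreement falls below a threshold. This is a deliberately simplified stand-in for AVNC's noise identification step (the real algorithm additionally weights labelers by their estimated quality); the threshold and data are illustrative:

```python
from collections import Counter

def flag_noisy(annotations, threshold=0.8):
    """Flag instances whose majority-vote agreement ratio is below a
    threshold as candidates for label-noise correction.

    annotations: dict mapping instance id -> list of worker labels.
    """
    suspect = []
    for item, labels in annotations.items():
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count / len(labels) < threshold:
            suspect.append(item)
    return suspect

# "a" is unanimous; "b" is an even split; "c" has one dissenter.
votes = {"a": [1, 1, 1, 1], "b": [1, 0, 1, 0], "c": [0, 0, 0, 1]}
print(flag_noisy(votes))  # ['b', 'c']
```

In the full framework the flagged instances are then relabeled by an ensemble classifier trained on the cleansed (unflagged) instances, rather than simply discarded.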

12.
Comput Intell Neurosci ; 2017: 4092135, 2017.
Article in English | MEDLINE | ID: mdl-28588611

ABSTRACT

Machine-constructed knowledge bases often contain noisy and inaccurate facts. There exists significant work on developing automated algorithms for knowledge base refinement. Automated approaches improve the quality of knowledge bases but are far from perfect. In this paper, we leverage crowdsourcing to improve the quality of automatically extracted knowledge bases. As human labelling is costly, an important research challenge is how to use limited human resources to maximize the quality improvement of a knowledge base. To address this problem, we first introduce a concept of semantic constraints that can be used to detect potential errors and perform inference among candidate facts. Then, based on semantic constraints, we propose rank-based and graph-based algorithms for crowdsourced knowledge refining, which judiciously select the most beneficial candidate facts for crowdsourcing and prune unnecessary questions. Our experiments show that our method improves the quality of knowledge bases significantly and outperforms state-of-the-art automatic methods at a reasonable crowdsourcing cost.
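
One simple kind of semantic constraint is a functional constraint: a relation such as "born in" should map each subject to a single object, so two conflicting extractions signal a potential error worth sending to the crowd. A minimal sketch of that detection idea (relation names and facts are illustrative, not from the paper):

```python
def find_conflicts(facts, functional_relations):
    """Detect (subject, relation) pairs that violate a functional
    constraint, i.e. a relation that should have exactly one object
    per subject. Such conflicts are natural candidates to prioritize
    for crowdsourced verification.

    facts: iterable of (subject, relation, object) triples.
    functional_relations: set of relation names assumed functional.
    """
    seen = {}
    conflicts = []
    for subj, rel, obj in facts:
        if rel in functional_relations:
            key = (subj, rel)
            if key in seen and seen[key] != obj:
                conflicts.append(key)
            seen.setdefault(key, obj)
    return conflicts

facts = [("Turing", "born_in", "London"),
         ("Turing", "born_in", "Paris"),   # conflicting extraction
         ("Turing", "field", "logic")]
print(find_conflicts(facts, {"born_in"}))  # [('Turing', 'born_in')]
```

The paper's rank-based and graph-based algorithms go further by scoring how much each crowdsourced answer would propagate through the constraint graph, so the most beneficial questions are asked first.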


Subject(s)
Algorithms; Crowdsourcing; Knowledge Bases; Automation; Crowdsourcing/economics; Semantics
13.
IEEE Trans Pattern Anal Mach Intell ; 39(6): 1103-1121, 2017 06.
Article in English | MEDLINE | ID: mdl-27295653

ABSTRACT

Model selection plays an important role in cost-sensitive SVM (CS-SVM). It has been proven that the global minimum cross validation (CV) error can be efficiently computed based on the solution path for one-parameter learning problems. However, it is a challenge to obtain the global minimum CV error for CS-SVM based on a one-dimensional solution path and traditional grid search, because CS-SVM has two regularization parameters. In this paper, we propose a solution- and error-surfaces-based CV approach (CV-SES). More specifically, we first compute a two-dimensional solution surface for CS-SVM based on a bi-parameter space partition algorithm, which can fit solutions of CS-SVM for all values of both regularization parameters. Then, we compute a two-dimensional validation error surface for each CV fold, which can fit validation errors of CS-SVM for all values of both regularization parameters. Finally, we obtain the CV error surface by superposing K validation error surfaces, from which the global minimum CV error of CS-SVM can be found. Experiments are conducted on seven datasets for cost-sensitive learning and on four datasets for imbalanced learning. Experimental results not only show that our proposed CV-SES has better generalization ability than CS-SVM with various hybrids of grid search and solution path methods, and than the recently proposed cost-sensitive hinge loss SVM with three-dimensional grid search, but also show that CV-SES uses less running time.

14.
IEEE Trans Neural Netw Learn Syst ; 28(7): 1646-1656, 2017 07.
Article in English | MEDLINE | ID: mdl-27101618

ABSTRACT

Minimax probability machine (MPM) is an interesting discriminative classifier based on generative prior knowledge. It can directly estimate the probabilistic accuracy bound by minimizing the maximum probability of misclassification. The structural information of data is an effective way to represent prior knowledge and has been found to be vital for designing classifiers in real-world problems. However, MPM considers only the prior probability distribution of each class with a given mean and covariance matrix, which does not efficiently exploit the structural information of data. In this paper, we use two finite mixture models to capture the structural information of the data in binary classification. For each subdistribution in a finite mixture model, only its mean and covariance matrix are assumed to be known. Based on the finite mixture models, we propose a structural MPM (SMPM). SMPM can be solved effectively by a sequence of second-order cone programming problems. Moreover, we extend the linear model of SMPM to a nonlinear model by exploiting kernelization techniques. We also show that SMPM can be interpreted as a large margin classifier and can be transformed into the support vector machine and the maxi-min margin machine under certain special conditions. Experimental results on both synthetic and real-world data sets demonstrate the effectiveness of SMPM.

15.
IEEE Trans Neural Netw Learn Syst ; 28(5): 1241-1248, 2017 05.
Article in English | MEDLINE | ID: mdl-26929067

ABSTRACT

ν-support vector classification (ν-SVC) has the advantage of using a regularization parameter ν to control the number of support vectors and margin errors. However, a recently proposed regularization path algorithm for ν-support vector classification (ν-SvcPath) suffers from exceptions and singularities in some special cases. In this brief, we first present a new equivalent dual formulation for ν-SVC and then propose a robust ν-SvcPath based on lower-upper (LU) decomposition with partial pivoting. Theoretical analysis and experimental results verify that our proposed robust regularization path algorithm can avoid the exceptions completely, handle the singularities in the key matrix, and fit the entire solution path in a finite number of steps. Experimental results also show that our proposed algorithm fits the entire solution path with fewer steps and less running time than the original one does.
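
LU decomposition with partial pivoting, the numerical tool this brief relies on to handle singularities in the key matrix, can be sketched on a generic linear system. This is the standard textbook procedure applied to a toy system, not the path algorithm itself:

```python
def lup_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting
    (the factorization behind LU decomposition with row pivoting).
    Pivoting swaps in the row with the largest remaining pivot,
    which avoids division by (near-)zero entries.
    """
    n = len(A)
    A = [row[:] for row in A]  # work on copies
    b = b[:]
    for k in range(n):
        # Partial pivoting: pick the row with the largest |pivot|.
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
            b[i] -= m * b[k]
    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

x = lup_solve([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print([round(v, 6) for v in x])  # [0.8, 1.4]
```

In the path algorithm the system being solved changes slightly at each breakpoint, and pivoting is what keeps the updates stable when the key matrix approaches singularity.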

16.
Comput Intell Neurosci ; 2016: 7386517, 2016.
Article in English | MEDLINE | ID: mdl-27313603

ABSTRACT

For many applications, finding rare instances or outliers can be more interesting than finding common patterns. Existing work in outlier detection has never considered the context of the deep web. In this paper, we argue that, for many scenarios, it is more meaningful to detect outliers over the deep web. In the deep web context, users must submit queries through a query interface to retrieve corresponding data, so traditional data mining methods cannot be applied directly. The primary contribution of this paper is a new data mining method for outlier detection over the deep web. In our approach, the query space of a deep web data source is stratified based on a pilot sample. Neighborhood sampling and uncertainty sampling are developed with the goal of improving recall and precision based on this stratification. Finally, a careful performance evaluation of our algorithm confirms that our approach can effectively detect outliers in the deep web.
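
The value of stratifying a query space before looking for outliers can be illustrated with a per-stratum z-score test: a value that is normal globally may be a clear outlier within its own stratum. This is a simplified illustration of the stratification idea, not the paper's neighborhood or uncertainty sampling; strata names and values are invented:

```python
def stratified_outliers(strata, z=1.5):
    """Flag values that deviate from their own stratum's mean by more
    than z standard deviations, treating each stratum separately.

    strata: dict mapping stratum name -> list of numeric values.
    """
    flagged = []
    for name, values in strata.items():
        n = len(values)
        mean = sum(values) / n
        std = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
        flagged += [(name, v) for v in values
                    if std and abs(v - mean) > z * std]
    return flagged

# 50 is unremarkable against the whole price range, but it is an
# outlier within the "cheap" stratum.
strata = {"cheap": [10, 11, 9, 10, 50], "luxury": [900, 950, 920]}
print(stratified_outliers(strata))  # [('cheap', 50)]
```

Pooling all values together here would mask the anomaly, which is why the paper stratifies the deep web query space from a pilot sample before sampling for outliers.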


Subject(s)
Algorithms; Information Storage and Retrieval; Internet; Commerce; Database Management Systems; Humans
17.
Neural Netw ; 67: 140-50, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25933108

ABSTRACT

ν-Support Vector Regression (ν-SVR) is an effective regression learning algorithm, which has the advantage of using a parameter ν to control the number of support vectors and adjust the width of the tube automatically. However, compared to ν-Support Vector Classification (ν-SVC) (Schölkopf et al., 2000), ν-SVR introduces an additional linear term into its objective function. Thus, directly applying the accurate on-line ν-SVC algorithm (AONSVM) to ν-SVR will not generate an effective initial solution. This is the main challenge in designing an incremental ν-SVR learning algorithm. To overcome this challenge, we propose a special procedure called initial adjustments in this paper. This procedure adjusts the weights of ν-SVC based on the Karush-Kuhn-Tucker (KKT) conditions to prepare an initial solution for the incremental learning. Combining the initial adjustments with the two steps of AONSVM produces an exact and effective incremental ν-SVR learning algorithm (INSVR). Theoretical analysis has proven the existence of the three key inverse matrices, which are the cornerstones of the three steps of INSVR (including the initial adjustments). Experiments on benchmark datasets demonstrate that INSVR avoids infeasible updating paths as far as possible and successfully converges to the optimal solution. The results also show that INSVR is faster than batch ν-SVR algorithms with both cold and warm starts.


Subject(s)
Machine Learning; Support Vector Machine; Algorithms; Humans; Linear Models; Normal Distribution
18.
Comput Intell Neurosci ; 2015: 109806, 2015.
Article in English | MEDLINE | ID: mdl-25834570

ABSTRACT

To improve the classification performance of imbalanced learning, a novel oversampling method, the immune centroids oversampling technique (ICOTE), based on an immune network, is proposed. ICOTE generates a set of immune centroids to broaden the decision regions of the minority class space. The representative immune centroids are treated as synthetic examples in order to resolve the imbalance problem. We utilize an artificial immune network to generate synthetic examples on clusters with high data densities, which addresses a shortcoming of the synthetic minority oversampling technique (SMOTE): its lack of attention to groupings of training examples. Meanwhile, we further improve the performance of ICOTE by integrating ENN with ICOTE, that is, ICOTE + ENN. ENN removes the majority class examples that invade the minority class space, so ICOTE + ENN favors the separation of the two classes. Our comprehensive experimental results show that the two proposed oversampling methods achieve better performance than renowned resampling methods.
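
The SMOTE idea that ICOTE builds on generates synthetic minority examples by interpolating between a minority point and one of its nearest minority neighbors. A minimal sketch of that baseline (ICOTE itself instead places immune-network centroids in dense clusters; parameters and data here are illustrative):

```python
import random

def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic minority-class points by interpolating
    between a random minority point and one of its k nearest minority
    neighbors (the core SMOTE interpolation step).

    minority: list of feature tuples belonging to the minority class.
    """
    rng = random.Random(seed)

    def dist2(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbors = sorted((p for p in minority if p is not x),
                           key=lambda p: dist2(x, p))[:k]
        nb = rng.choice(neighbors)
        gap = rng.random()  # random point on the segment x -> nb
        synthetic.append(tuple(xi + gap * (ni - xi)
                               for xi, ni in zip(x, nb)))
    return synthetic

minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(smote_like(minority, n_new=2))
```

Because every synthetic point lies on a segment between two existing minority points, SMOTE can only fill in along existing examples; ICOTE's centroid generation is motivated precisely by wanting to respect cluster structure instead.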


Subject(s)
Algorithms; Artificial Intelligence; Immune System; Learning; Sampling Studies; Area Under Curve; Humans
19.
IEEE Trans Neural Netw Learn Syst ; 26(7): 1403-16, 2015 Jul.
Article in English | MEDLINE | ID: mdl-25134094

ABSTRACT

Support vector ordinal regression (SVOR) is a popular method for tackling ordinal regression problems. However, until now no effective algorithms had been proposed to address incremental SVOR learning, due to the complicated formulations of SVOR. Recently, an interesting accurate on-line algorithm was proposed for training ν-support vector classification (ν-SVC), which can handle a quadratic formulation with a pair of equality constraints. In this paper, we first present a modified SVOR formulation based on a sum-of-margins strategy. The formulation has multiple constraints, each of which includes a mixture of an equality and an inequality. Then, we extend the accurate on-line ν-SVC algorithm to the modified formulation and propose an effective incremental SVOR algorithm. The algorithm can handle a quadratic formulation with multiple constraints, where each constraint consists of an equality and an inequality. More importantly, it tackles the conflicts between the equality and inequality constraints. We also provide a finite convergence analysis for the algorithm. Numerical experiments on several benchmark and real-world data sets show that the incremental algorithm converges to the optimal solution in a finite number of steps and is faster than existing batch and incremental SVOR algorithms. Meanwhile, the modified formulation has better accuracy than the existing incremental SVOR algorithm and is as accurate as the sum-of-margins-based formulation of Shashua and Levin.

20.
ScientificWorldJournal ; 2014: 834013, 2014.
Article in English | MEDLINE | ID: mdl-24605045

ABSTRACT

A motion trajectory is an intuitive representation, in the time-space domain, of the micromotion behavior of a moving target. Trajectory analysis is an important approach to recognizing abnormal behaviors of moving targets. To address the complexity of vehicle trajectories, this paper first proposes a trajectory pattern learning method based on dynamic time warping (DTW) and spectral clustering. It introduces the DTW distance to measure the distances between vehicle trajectories and determines the number of clusters automatically via a spectral clustering algorithm based on the distance matrix. It then clusters the sample data points into different clusters. After spatial patterns and direction patterns are learned from the clusters, a recognition method for detecting abnormal vehicle behaviors based on mixed pattern matching is proposed. The experimental results show that the proposed technical scheme can recognize the main types of abnormal traffic behaviors effectively and has good robustness. A real-world application verified its feasibility and validity.
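
The DTW distance used to compare vehicle trajectories can be computed with a standard dynamic program; it aligns trajectories of different lengths before the pairwise distance matrix is handed to spectral clustering. A minimal 1-D sketch of the standard algorithm (real trajectories would use a per-point 2-D distance):

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two 1-D sequences.

    Fills the classic DP table where each cell adds the local cost
    to the cheapest of the three admissible predecessor alignments.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # stretch a
                                 d[i][j - 1],      # stretch b
                                 d[i - 1][j - 1])  # match step
    return d[n][m]

# Same shape at different speeds aligns perfectly; a reversed
# trajectory does not.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([1, 2, 3], [3, 2, 1]))     # 4.0
```

Building the full pairwise DTW matrix over all trajectories gives exactly the distance matrix the paper's spectral clustering step consumes.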


Subject(s)
Models, Theoretical; Motion; Algorithms; Motor Vehicles; Pattern Recognition, Automated