Search | Portal Regional de la BVS

Gan, Wensheng; Lin, Jerry Chun-Wei; Zhang, Jiexiong; Fournier-Viger, Philippe; Chao, Han-Chieh; Yu, Philip S.

IEEE Trans Cybern ; 51(2): 487-500, 2021 Feb.

Article in English | MEDLINE | ID: mdl-32142464

ABSTRACT

High-utility sequential pattern (HUSP) mining is an emerging topic in the field of knowledge discovery in databases. It consists of discovering subsequences that have a high utility (importance) in sequences, which are referred to as HUSPs. HUSPs can be applied to many real-life applications, such as market basket analysis, e-commerce recommendations, click-stream analysis, and route planning. Several algorithms have been proposed to efficiently mine utility-based useful sequential patterns. However, due to the combinatorial explosion of the search space at low utility thresholds and on large-scale data, the performance of these algorithms is unsatisfactory in terms of runtime and memory usage. Hence, this article proposes an efficient algorithm for the task of HUSP mining, called HUSP mining with UL-list (HUSP-ULL). It utilizes a lexicographic q-sequence (LQS)-tree and a utility-linked (UL)-list structure to quickly discover HUSPs. Furthermore, two pruning strategies are introduced in HUSP-ULL to obtain tight upper bounds on the utility of the candidate sequences and reduce the search space by pruning unpromising candidates early. Substantial experiments on both real-life and synthetic datasets showed that HUSP-ULL can effectively and efficiently discover the complete set of HUSPs and that it outperforms the state-of-the-art algorithms.
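To make the notion of sequence utility concrete, the following minimal sketch computes the utility of a candidate pattern in one quantitative sequence. It assumes the definition commonly used in the HUSP literature: an item's utility is its quantity times a unit profit, and a pattern's utility in a sequence is the maximum over all of its occurrences. The abstract does not spell this definition out, and the sketch does not implement the LQS-tree, the UL-list, or the pruning strategies of HUSP-ULL. All names (`sequence_utility`, `example_profit`, `example_qseq`) are hypothetical.

```python
from typing import Dict, List, Optional

QElement = Dict[str, int]   # one itemset of a q-sequence: item -> quantity
QSequence = List[QElement]  # ordered list of itemsets
Pattern = List[List[str]]   # ordered list of itemsets, without quantities


def element_utility(itemset: List[str], q_element: QElement,
                    profit: Dict[str, int]) -> Optional[int]:
    """Utility of `itemset` inside one q-sequence element, or None if absent."""
    if not all(item in q_element for item in itemset):
        return None
    return sum(q_element[item] * profit[item] for item in itemset)


def sequence_utility(pattern: Pattern, q_seq: QSequence,
                     profit: Dict[str, int]) -> int:
    """Maximum utility over all occurrences of `pattern` in `q_seq` (0 if none)."""
    neg = float("-inf")
    n = len(q_seq)
    # best[j]: best utility of the pattern prefix matched so far within q_seq[:j].
    best = [0] * (n + 1)  # the empty prefix matches everywhere with utility 0
    for itemset in pattern:
        new = [neg] * (n + 1)
        for j in range(1, n + 1):
            new[j] = new[j - 1]  # option 1: do not use q_seq[j-1]
            u = element_utility(itemset, q_seq[j - 1], profit)
            if u is not None and best[j - 1] != neg:
                # option 2: match this itemset against q_seq[j-1]
                new[j] = max(new[j], best[j - 1] + u)
        best = new
    return 0 if best[n] == neg else int(best[n])


# Hypothetical example data (not taken from the article).
example_profit = {"a": 2, "b": 1, "c": 3}
example_qseq = [{"a": 1, "b": 2}, {"c": 1}, {"a": 3, "c": 2}]
print(sequence_utility([["a"], ["c"]], example_qseq, example_profit))
```

Under these assumptions, a pattern would count as a HUSP when the sum of its utilities over all sequences in the database reaches a user-chosen threshold.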

HUOPM: High-Utility Occupancy Pattern Mining.

Gan, Wensheng; Lin, Jerry Chun-Wei; Fournier-Viger, Philippe; Chao, Han-Chieh; Yu, Philip S.

IEEE Trans Cybern ; 50(3): 1195-1208, 2020 Mar.

Article in English | MEDLINE | ID: mdl-30794524

ABSTRACT

Mining useful patterns from varied types of databases is an important research topic with many real-life applications. Most studies have considered frequency as the sole interestingness measure for identifying high-quality patterns. However, each object is different in nature. The relative importance of objects is not equal in terms of criteria such as utility, risk, or interest. Besides, another limitation of frequent patterns is that they generally have a low occupancy, that is, they often represent small sets of items in transactions containing many items and, thus, may not be truly representative of these transactions. To extract high-quality patterns in real-life applications, this paper extends the occupancy measure to also assess the utility of patterns in transaction databases. We propose an efficient algorithm named high-utility occupancy pattern mining (HUOPM). It considers user preferences in terms of frequency, utility, and occupancy. A novel frequency-utility tree and two compact data structures, called the utility-occupancy list and frequency-utility table, are designed to provide global and partial downward closure properties for pruning the search space. The proposed method can efficiently discover the complete set of high-quality patterns without candidate generation. Extensive experiments have been conducted on several datasets to evaluate the effectiveness and efficiency of the proposed algorithm. Results show that the derived patterns are intelligible, reasonable, and acceptable, and that HUOPM with its pruning strategies outperforms the state-of-the-art algorithm in terms of both runtime and search space.
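As an illustration of how a utility-based occupancy measure can be evaluated, the sketch below assumes one plausible reading: the occupancy of a pattern in a transaction is the share of that transaction's total utility contributed by the pattern, and the pattern's overall utility occupancy is the average of these shares over the transactions that contain it. The abstract does not give the formula, so this definition, the function names, and the toy data are assumptions. The sketch is a brute-force scan and does not use the frequency-utility tree or the list structures of HUOPM.

```python
from typing import Dict, Iterable, List, Tuple

Transaction = Dict[str, int]  # item -> purchased quantity


def transaction_utility(t: Transaction, profit: Dict[str, int]) -> int:
    """Total utility of a transaction: sum of quantity * unit profit."""
    return sum(qty * profit[item] for item, qty in t.items())


def utility_occupancy(pattern: Iterable[str], db: List[Transaction],
                      profit: Dict[str, int]) -> Tuple[int, float]:
    """Return (support, average utility share) of `pattern` in `db`."""
    items = list(pattern)
    shares = []
    for t in db:
        if not all(item in t for item in items):
            continue  # the pattern does not occur in this transaction
        tu = transaction_utility(t, profit)
        if tu == 0:
            continue  # avoid division by zero on degenerate transactions
        pattern_utility = sum(t[item] * profit[item] for item in items)
        shares.append(pattern_utility / tu)
    support = len(shares)
    return support, (sum(shares) / support if support else 0.0)


# Hypothetical example data (not taken from the article).
example_profit = {"a": 2, "b": 1, "c": 3}
example_db = [{"a": 1, "b": 2}, {"a": 2, "c": 1}, {"b": 4, "c": 2}]
support, occupancy = utility_occupancy({"a"}, example_db, example_profit)
print(support, round(occupancy, 4))
```

A miner in this style would keep patterns whose support and utility occupancy both meet user-defined minimum thresholds.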

An incremental high-utility mining algorithm with transaction insertion.

Lin, Jerry Chun-Wei; Gan, Wensheng; Hong, Tzung-Pei; Zhang, Binbin.

ScientificWorldJournal ; 2015: 161564, 2015.

Article in English | MEDLINE | ID: mdl-25811038

ABSTRACT

Association-rule mining is commonly used to discover useful and meaningful patterns from a very large database. It only considers the occurrence frequencies of items to reveal the relationships among itemsets. Traditional association-rule mining is, however, not suitable for real-world applications, since the items a customer purchases may differ in factors such as profit or quantity. High-utility mining was designed to overcome the limitations of association-rule mining by considering both the quantity and profit measures. Most high-utility mining algorithms are designed to handle static databases. Fewer studies address dynamic high-utility mining with transaction insertion, which requires rescanning the database and suffers from the combinatorial explosion of the pattern-growth mechanism. In this paper, an efficient incremental algorithm with transaction insertion is designed that uses utility-list structures to reduce computation without candidate generation. The enumeration tree and the relationships between 2-itemsets are also adopted in the proposed algorithm to speed up the computations. Several experiments are conducted to show the performance of the proposed algorithm in terms of runtime, memory consumption, and number of generated patterns.
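The idea of handling transaction insertion without rescanning the database can be sketched with a simplified per-item utility list. The sketch assumes that each list maps a transaction id to the item's utility (quantity times unit profit) in that transaction, and that an itemset's utility is summed over the transactions shared by all of its items. Utility-lists in the literature usually also store a remaining-utility value used for pruning, which is omitted here. The class and method names are hypothetical and do not reproduce the article's algorithm, its enumeration tree, or its 2-itemset relationships.

```python
from typing import Dict, Iterable


class IncrementalUtilityIndex:
    """Simplified per-item utility lists that grow as transactions arrive."""

    def __init__(self, profit: Dict[str, int]) -> None:
        self.profit = profit
        self.lists: Dict[str, Dict[int, int]] = {}  # item -> {tid: utility}
        self.next_tid = 0

    def insert(self, transaction: Dict[str, int]) -> int:
        """Add one transaction (item -> quantity); earlier data is not rescanned."""
        tid = self.next_tid
        for item, qty in transaction.items():
            self.lists.setdefault(item, {})[tid] = qty * self.profit[item]
        self.next_tid += 1
        return tid

    def utility(self, itemset: Iterable[str]) -> int:
        """Utility of `itemset`, summed over transactions containing all its items."""
        per_item = [self.lists.get(item, {}) for item in itemset]
        if not per_item:
            return 0
        # Transaction ids shared by every item's utility list.
        shared = set(per_item[0]).intersection(*per_item[1:])
        return sum(entry[tid] for tid in shared for entry in per_item)


# Hypothetical example data (not taken from the article).
index = IncrementalUtilityIndex({"a": 2, "b": 1, "c": 3})
index.insert({"a": 1, "b": 2})
index.insert({"a": 2, "c": 1})
before = index.utility(["a", "b"])
index.insert({"a": 1, "b": 1, "c": 1})  # newly inserted transaction
after = index.utility(["a", "b"])
print(before, after)
```

Each insertion touches only the lists of the items in the new transaction, which is the property that lets an incremental approach avoid a full rescan.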
