Results 1 - 9 of 9

1.
Entropy (Basel) ; 26(6)2024 Jun 17.
Article in English | MEDLINE | ID: mdl-38920528

ABSTRACT

In this paper, we consider classes of decision tables with many-valued decisions closed under the operations of removal of columns, changing of decisions, permutation of columns, and duplication of columns. We study relationships among three parameters of these tables: the complexity of a decision table (if the complexity measure for decision trees is depth, then the complexity of a decision table is the number of columns in it), the minimum complexity of a deterministic decision tree, and the minimum complexity of a nondeterministic decision tree. We consider a rough classification of the functions characterizing these relationships and enumerate all seven possible types of relationships.
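
As a toy illustration (not taken from the paper), the following brute-force sketch computes the minimum depth of a deterministic decision tree for a small decision table with many-valued decisions: a subtree may terminate as soon as some decision is common to all rows consistent with the answers so far. All names and the example table are illustrative, and the code assumes rows with equal attribute tuples share at least one decision.

```python
def min_tree_depth(rows, free_cols):
    """Minimum depth of a deterministic decision tree solving a
    decision table with many-valued decisions (brute force, toy sizes).
    rows: list of (attribute_tuple, frozenset_of_decisions);
    free_cols: set of column indices not queried yet."""
    # A leaf suffices if some decision is shared by every remaining row.
    if frozenset.intersection(*(d for _, d in rows)):
        return 0
    best = None
    for c in free_cols:
        # Worst case over the answers to the query "value of column c?"
        worst = 0
        for v in {a[c] for a, _ in rows}:
            subset = [(a, d) for a, d in rows if a[c] == v]
            worst = max(worst, min_tree_depth(subset, free_cols - {c}))
        best = 1 + worst if best is None else min(best, 1 + worst)
    return best

# XOR-like table: no single column determines a common decision,
# so the minimum depth is 2 (the number of columns).
table = [((0, 0), frozenset({1})), ((0, 1), frozenset({2})),
         ((1, 0), frozenset({2})), ((1, 1), frozenset({1}))]
print(min_tree_depth(table, {0, 1}))  # 2
```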

2.
Entropy (Basel) ; 25(2)2023 Feb 14.
Article in English | MEDLINE | ID: mdl-36832715

ABSTRACT

In this paper, we study arbitrary subword-closed languages over the alphabet {0,1} (binary subword-closed languages). For the set L(n) of words of length n belonging to a binary subword-closed language L, we investigate the depth of decision trees solving the recognition and membership problems deterministically and nondeterministically. In the recognition problem, we must identify a given word from L(n) using queries, each of which, for some i∈{1,…,n}, returns the ith letter of the word. In the membership problem, we must decide, using the same queries, whether a given word of length n over the alphabet {0,1} belongs to L(n). As n grows, the minimum depth of the decision trees solving the recognition problem deterministically is either bounded from above by a constant, grows logarithmically, or grows linearly. For the other types of trees and problems (decision trees solving the recognition problem nondeterministically, and decision trees solving the membership problem deterministically or nondeterministically), as n grows, the minimum depth is either bounded from above by a constant or grows linearly. We study the joint behavior of the minimum depths of these four types of decision trees and describe five complexity classes of binary subword-closed languages.
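
Taking "subword" to mean a scattered subsequence (an assumption about the terminology; names are illustrative), a minimal sketch of producing a finite subword-closed language and the set L(n):

```python
from itertools import combinations

def subword_closure(words):
    """All subsequences ("subwords") of the given binary words; a
    language is subword-closed iff it equals its own closure."""
    closure = set()
    for w in words:
        for k in range(len(w) + 1):
            for idx in combinations(range(len(w)), k):
                closure.add("".join(w[i] for i in idx))
    return closure

L = subword_closure({"000111"})             # a small subword-closed language
print(sorted(w for w in L if len(w) == 3))  # L(3): ['000', '001', '011', '111']
```

In the recognition problem, a decision tree would identify a word of L(3) using queries of the form "what is the ith letter?"; in the membership problem, it would decide for an arbitrary word of length 3 whether it lies in L(3).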

3.
Entropy (Basel) ; 25(10)2023 Oct 03.
Article in English | MEDLINE | ID: mdl-37895532

ABSTRACT

In this paper, we consider classes of conventional decision tables closed under the removal of attributes (columns) and the changing of decisions assigned to rows. For tables from an arbitrary closed class, we study the dependence of the minimum complexity of deterministic and nondeterministic decision trees on the complexity of the set of attributes attached to columns. We also study the dependence of the minimum complexity of deterministic decision trees on the minimum complexity of nondeterministic decision trees. Note that a nondeterministic decision tree can be interpreted as a set of true decision rules that covers all rows of the table.
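
The closing remark can be made concrete with a minimal sketch (illustrative names, not the paper's code): a nondeterministic decision tree corresponds to a system of decision rules that are true for the table and together cover all its rows, and the maximum number of conditions in a rule plays the role of depth.

```python
def rule_matches(conditions, attrs):
    """conditions: a map from column index to required value."""
    return all(attrs[c] == v for c, v in conditions.items())

def rule_is_true(conditions, decision, table):
    """True for the table: every row satisfying the conditions
    carries the rule's decision."""
    return all(dec == decision
               for attrs, dec in table if rule_matches(conditions, attrs))

def covers_all_rows(rules, table):
    """Each row must be realizable for at least one true rule."""
    return all(any(rule_matches(cond, attrs) and rule_is_true(cond, dec, table)
                   for cond, dec in rules)
               for attrs, _ in table)

table = [((0, 0), 1), ((0, 1), 1), ((1, 0), 2), ((1, 1), 2)]
rules = [({0: 0}, 1), ({0: 1}, 2)]
print(covers_all_rows(rules, table))   # True
print(max(len(c) for c, _ in rules))   # "nondeterministic depth" here: 1
```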

4.
Entropy (Basel) ; 25(4)2023 Mar 23.
Article in English | MEDLINE | ID: mdl-37190335

ABSTRACT

In this research, we consider decision trees that incorporate standard queries, each based on one feature, as well as hypotheses about the values of all features. Such decision trees are used to represent knowledge and are comparable to those investigated in exact learning, where membership and equivalence queries are used. As an application, we look into the problem of constructing decision trees for two cases: sorting a sequence that contains equal elements, and multiple-valued decision tables modified from the UCI Machine Learning Repository. For these applications, we contrast the efficiency of several forms of decision trees with hypotheses that are optimal relative to the parameter depth. We also investigate the efficiency of decision trees built by dynamic programming and by an entropy-based greedy method. We found that the greedy algorithm produces results very similar to those of the dynamic programming algorithms; since the dynamic programming algorithms are time-consuming, the greedy algorithms can readily be applied instead.
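
A hedged sketch of the two query types such trees combine, mirroring membership and equivalence queries from exact learning; the counterexample format returned for a failed hypothesis is an assumption made for illustration.

```python
import random

def attribute_query(hidden, i):
    """Standard query: the value of one feature of the hidden tuple."""
    return hidden[i]

def hypothesis_query(hidden, guess):
    """Hypothesis query: propose values for all features at once.
    Returns None if the guess is correct; otherwise a counterexample
    (index, true value) at some position where the guess is wrong
    (assumed format, analogous to an equivalence query)."""
    wrong = [i for i, (g, h) in enumerate(zip(guess, hidden)) if g != h]
    if not wrong:
        return None
    i = random.choice(wrong)
    return i, hidden[i]

hidden = (1, 0, 1)
print(attribute_query(hidden, 2))           # 1
print(hypothesis_query(hidden, (1, 1, 1)))  # (1, 0): feature 1 differs
```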

5.
Entropy (Basel) ; 24(1)2022 Jan 12.
Article in English | MEDLINE | ID: mdl-35052142

ABSTRACT

In this paper, based on the results of rough set theory, test theory, and exact learning, we investigate decision trees over infinite sets of binary attributes represented as infinite binary information systems. We define the notion of a problem over an information system and study three functions of Shannon type, which characterize, in the worst case, the dependence of the minimum depth of a decision tree solving a problem on the number of attributes in the problem description. The three functions correspond to (i) decision trees using attributes, (ii) decision trees using hypotheses (an analog of equivalence queries from exact learning), and (iii) decision trees using both attributes and hypotheses. The first function has two possible types of behavior: logarithmic and linear (this result follows from more general results published by the author earlier). The second and third functions have three possible types of behavior: constant, logarithmic, and linear (these results were published by the author earlier without the proofs that are given in the present paper). Based on the obtained results, we divide the set of all infinite binary information systems into four complexity classes; within each class, the type of behavior of each of the three functions does not change.

6.
Entropy (Basel) ; 24(10)2022 Oct 01.
Article in English | MEDLINE | ID: mdl-37420421

ABSTRACT

In this paper, we deal with distributed data represented either as a finite set T of decision tables with equal sets of attributes or as a finite set I of information systems with equal sets of attributes. In the former case, we discuss a way to study the decision trees common to all tables from the set T: building a decision table in which the set of decision trees coincides with the set of decision trees common to all tables from T. We show when such a decision table can be built and how to build it in polynomial time. Once we have such a table, we can apply various decision tree learning algorithms to it. We extend this approach to the study of tests (reducts) and decision rules common to all tables from T. In the latter case, we discuss a way to study the association rules common to all information systems from the set I: building a joint information system for which the set of true association rules that are realizable for a given row ρ and have a given attribute a on the right-hand side coincides with the set of association rules that are true for all information systems from I, have the attribute a on the right-hand side, and are realizable for the row ρ. We then show how to build such a joint information system in polynomial time. Once it is built, we can apply various association rule learning algorithms to it.
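
A minimal sketch (illustrative representation, not the paper's construction) of the defining condition: a decision tree is common to all tables from T if it returns the row's decision on every row of every table.

```python
def apply_tree(tree, attrs):
    """tree: either a decision (leaf) or a pair
    (column, {value: subtree})."""
    while isinstance(tree, tuple):
        col, branches = tree
        tree = branches[attrs[col]]
    return tree

def common_to_all(tree, tables):
    """Common to a set of decision tables with equal attribute sets:
    the tree gives the correct decision for every row of every table."""
    return all(apply_tree(tree, attrs) == dec
               for table in tables for attrs, dec in table)

t1 = [((0, 0), 1), ((1, 0), 2)]
t2 = [((0, 1), 1), ((1, 1), 2)]
tree = (0, {0: 1, 1: 2})               # query column 0, then decide
print(common_to_all(tree, [t1, t2]))   # True
```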

7.
Entropy (Basel) ; 23(7)2021 Jun 25.
Article in English | MEDLINE | ID: mdl-34201971

ABSTRACT

In this paper, we consider decision trees that use both conventional queries, each based on one attribute, and queries based on hypotheses about the values of all attributes. Such decision trees are similar to those studied in exact learning, where membership and equivalence queries are allowed. We present a greedy algorithm based on entropy for the construction of such decision trees and discuss the results of computer experiments on various data sets and randomly generated Boolean functions.
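
A sketch of the entropy-based attribute selection such a greedy algorithm might use (standard information-gain choice; the paper's exact impurity criterion and its handling of hypothesis queries are not reproduced here):

```python
from collections import Counter, defaultdict
from math import log2

def entropy(decisions):
    n = len(decisions)
    return -sum(c / n * log2(c / n) for c in Counter(decisions).values())

def best_attribute(table, free_cols):
    """table: list of (attribute_tuple, decision). Choose the column
    whose split minimizes the weighted entropy of the decisions
    (equivalently, maximizes information gain)."""
    def weighted_entropy(col):
        branches = defaultdict(list)
        for attrs, dec in table:
            branches[attrs[col]].append(dec)
        return sum(len(d) / len(table) * entropy(d)
                   for d in branches.values())
    return min(free_cols, key=weighted_entropy)

table = [((0, 0), 'a'), ((0, 1), 'a'), ((1, 0), 'b'), ((1, 1), 'b')]
print(best_attribute(table, {0, 1}))  # 0: splitting on column 0 is pure
```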

8.
Entropy (Basel) ; 23(12)2021 Dec 07.
Article in English | MEDLINE | ID: mdl-34945947

ABSTRACT

Conventional decision trees use queries, each of which is based on one attribute. In this study, we also examine decision trees that admit additional queries based on hypotheses. This kind of query is similar to the equivalence queries considered in exact learning. Earlier, we designed dynamic programming algorithms for computing the minimum depth and the minimum number of internal nodes of decision trees with hypotheses. The modifications of these algorithms considered in the present paper allow us to build decision trees with hypotheses that are optimal relative to the depth or to the number of internal nodes. We compare the length and coverage of decision rules extracted from optimal decision trees with hypotheses against those of decision rules extracted from optimal conventional decision trees, to choose the ones that are preferable as a tool for the representation of information. To this end, we conduct computer experiments on various decision tables from the UCI Machine Learning Repository, as well as on decision tables for randomly generated Boolean functions. The collected results show that the decision rules derived from decision trees with hypotheses are in many cases better than the rules extracted from conventional decision trees.
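
A sketch of the rule extraction the comparison relies on (illustrative tree representation): each root-to-leaf path yields one decision rule; its length is the number of conditions on the path, and its coverage is the number of rows satisfying them.

```python
def extract_rules(tree, conditions=()):
    """tree: a decision (leaf) or a pair (column, {value: subtree}).
    Each root-to-leaf path becomes one decision rule."""
    if not isinstance(tree, tuple):
        yield dict(conditions), tree
        return
    col, branches = tree
    for value, subtree in branches.items():
        yield from extract_rules(subtree, conditions + ((col, value),))

def coverage(conditions, table):
    """Number of rows on which the rule is realizable."""
    return sum(all(attrs[c] == v for c, v in conditions.items())
               for attrs, _ in table)

tree = (0, {0: 'a', 1: (1, {0: 'b', 1: 'a'})})
table = [((0, 0), 'a'), ((0, 1), 'a'), ((1, 0), 'b'), ((1, 1), 'a')]
for cond, dec in extract_rules(tree):
    print(cond, '->', dec, '| length', len(cond),
          '| coverage', coverage(cond, table))
```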

9.
BMC Bioinformatics ; 12 Suppl 1: S34, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342565

ABSTRACT

BACKGROUND: Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. They form and break while a protein deforms, for instance during the transition from a non-functional to a functional state. The intrinsic strength of an individual H-bond has been studied from an energetic viewpoint, but energy alone may not be a very good predictor.

METHODS: This paper describes inductive learning methods to train protein-independent probabilistic models of H-bond stability from molecular dynamics (MD) simulation trajectories of various proteins. The training data contains 32 input attributes (predictors) that describe an H-bond and its local environment in a conformation c, and the output attribute is the probability that the H-bond will be present in an arbitrary conformation of this protein achievable from c within a time duration Δ. We model the dependence of the output variable on the predictors by a regression tree.

RESULTS: Several models are built using 6 MD simulation trajectories containing over 4000 distinct H-bonds (millions of occurrences). Experimental results demonstrate that such models can predict H-bond stability quite well. They perform roughly 20% better than models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a conformation. In most tests, about 80% of the 10% of H-bonds predicted as the least stable are actually among the 10% truly least stable. The important attributes identified during the tree construction are consistent with previous findings.

CONCLUSIONS: We use inductive learning methods to build protein-independent probabilistic models to study H-bond stability, and demonstrate that the models perform better than H-bond energy alone.
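
A hedged sketch of the modeling setup described above: a regression tree mapping descriptor vectors of an H-bond's local environment to a stability probability. The data here are synthetic stand-ins (the 32 real attributes and the MD trajectories are not reproduced), and the dependence of the target on the features is assumed for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Synthetic stand-ins: 32 descriptors per H-bond occurrence and a
# target in [0, 1], the probability that the bond persists.
X = rng.normal(size=(5000, 32))
y = 1.0 / (1.0 + np.exp(-X[:, 0] - 0.5 * X[:, 1]))  # assumed dependence

model = DecisionTreeRegressor(max_depth=6, min_samples_leaf=50)
model.fit(X[:4000], y[:4000])

print("R^2 on held-out data:", model.score(X[4000:], y[4000:]))
# Feature importances indicate which descriptors drive stability.
print("top descriptors:", np.argsort(model.feature_importances_)[::-1][:5])
```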


Subject(s)
Hydrogen Bonding, Models, Statistical, Molecular Dynamics Simulation, Proteins/chemistry, Algorithms, Protein Stability, Protein Secondary Structure