Search | Biblioteca Virtual en Salud

Entropy-Based Greedy Algorithm for Decision Trees Using Hypotheses.

Azad, Mohammad; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail.

Entropy (Basel); 23(7), 2021 Jun 25.

Article in English | MEDLINE | ID: mdl-34201971

ABSTRACT

In this paper, we consider decision trees that use both conventional queries, each based on a single attribute, and queries based on hypotheses about the values of all attributes. Such decision trees are similar to those studied in exact learning, where membership and equivalence queries are allowed. We present a greedy algorithm based on entropy for the construction of these decision trees and discuss the results of computer experiments on various data sets and randomly generated Boolean functions.
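As a rough illustration of the entropy criterion mentioned in the abstract above, the following Python sketch scores conventional single-attribute queries on a small decision table and picks the one that leaves the least uncertainty about the decision. It is not the authors' algorithm: hypothesis-based queries are left out entirely, and the names `entropy`, `best_attribute`, and the toy table are assumptions made for this example.

```python
from collections import Counter
from math import log2


def entropy(rows):
    """Shannon entropy of the decision labels in a list of (values, decision) rows."""
    counts = Counter(decision for _, decision in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def best_attribute(rows, attributes):
    """Return the attribute whose split yields the lowest weighted entropy."""
    def weighted_entropy(attr):
        # Group the rows by the value they take on this attribute.
        groups = {}
        for values, decision in rows:
            groups.setdefault(values[attr], []).append((values, decision))
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    return min(attributes, key=weighted_entropy)


# Hypothetical decision table: the decision simply copies x0, and x1 is irrelevant.
table = [
    ({"x0": 0, "x1": 0}, 0),
    ({"x0": 0, "x1": 1}, 0),
    ({"x0": 1, "x1": 0}, 1),
    ({"x0": 1, "x1": 1}, 1),
]

print(entropy(table))                       # uncertainty before any query
print(best_attribute(table, ["x1", "x0"]))  # attribute chosen by the greedy step
```

A full greedy construction would apply this step recursively to each subtable until every subtable has a single decision; how hypothesis queries are scored alongside attribute queries is described in the paper itself, not here.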

Decision Rules Derived from Optimal Decision Trees with Hypotheses.

Azad, Mohammad; Chikalov, Igor; Hussain, Shahid; Moshkov, Mikhail; Zielosko, Beata.

Entropy (Basel); 23(12), 2021 Dec 07.

Article in English | MEDLINE | ID: mdl-34945947

ABSTRACT

Conventional decision trees use queries, each of which is based on one attribute. In this study, we also examine decision trees that handle additional queries based on hypotheses. This kind of query is similar to the equivalence queries considered in exact learning. Earlier, we designed dynamic programming algorithms for the computation of the minimum depth and the minimum number of internal nodes in decision trees that have hypotheses. A modification of these algorithms, considered in the present paper, permits us to build decision trees with hypotheses that are optimal relative to the depth or relative to the number of internal nodes. We compare the length and coverage of decision rules extracted from optimal decision trees with hypotheses and decision rules extracted from optimal conventional decision trees to choose the ones that are preferable as a tool for the representation of information. To this end, we conduct computer experiments on various decision tables from the UCI Machine Learning Repository. We also consider decision tables for randomly generated Boolean functions. The collected results show that the decision rules derived from decision trees with hypotheses are in many cases better than the rules extracted from conventional decision trees.
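To make the two rule measures named in the abstract above more concrete, here is a minimal Python sketch that reads one rule off each root-to-leaf path of a conventional decision tree and reports its length and coverage. The tree encoding, the function names, and the definition of coverage used here (the number of table rows that satisfy the rule's conditions and carry its decision) are assumptions made for illustration; trees with hypothesis queries are not modelled.

```python
def extract_rules(tree, conditions=()):
    """Yield one (conditions, decision) rule per root-to-leaf path.

    A tree is either a leaf (a decision label) or a tuple
    (attribute, {value: subtree}).
    """
    if not isinstance(tree, tuple):  # leaf: emit the accumulated conditions
        yield dict(conditions), tree
        return
    attr, branches = tree
    for value, subtree in branches.items():
        yield from extract_rules(subtree, conditions + ((attr, value),))


def rule_length(rule):
    """Number of attribute conditions on the left-hand side of the rule."""
    conditions, _ = rule
    return len(conditions)


def rule_coverage(rule, rows):
    """Count rows that match every condition and have the rule's decision."""
    conditions, decision = rule
    return sum(
        1
        for values, d in rows
        if d == decision and all(values[a] == v for a, v in conditions.items())
    )


# Hypothetical decision table and tree for the Boolean function x0 AND x1.
table = [
    ({"x0": 0, "x1": 0}, 0),
    ({"x0": 0, "x1": 1}, 0),
    ({"x0": 1, "x1": 0}, 0),
    ({"x0": 1, "x1": 1}, 1),
]
tree = ("x0", {0: 0, 1: ("x1", {0: 0, 1: 1})})

for rule in extract_rules(tree):
    print(rule, rule_length(rule), rule_coverage(rule, table))
```

Shorter rules with larger coverage are generally easier to read and apply to more rows, which is why the two measures are natural ones to compare.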

Learning probabilistic models of hydrogen bond stability from molecular dynamics simulation trajectories.

Chikalov, Igor; Yao, Peggy; Moshkov, Mikhail; Latombe, Jean-Claude.

BMC Bioinformatics; 12 Suppl 1: S34, 2011 Feb 15.

Article in English | MEDLINE | ID: mdl-21342565

ABSTRACT

BACKGROUND: Hydrogen bonds (H-bonds) play a key role in both the formation and stabilization of protein structures. They form and break while a protein deforms, for instance during the transition from a non-functional to a functional state. The intrinsic strength of an individual H-bond has been studied from an energetic viewpoint, but energy alone may not be a very good predictor. METHODS: This paper describes inductive learning methods to train protein-independent probabilistic models of H-bond stability from molecular dynamics (MD) simulation trajectories of various proteins. The training data contains 32 input attributes (predictors) that describe an H-bond and its local environment in a conformation c, and the output attribute is the probability that the H-bond will be present in an arbitrary conformation of this protein achievable from c within a time duration Δ. We model the dependence of the output variable on the predictors by a regression tree. RESULTS: Several models are built using 6 MD simulation trajectories containing over 4000 distinct H-bonds (millions of occurrences). Experimental results demonstrate that such models can predict H-bond stability quite well. They perform roughly 20% better than models based on H-bond energy alone. In addition, they can accurately identify a large fraction of the least stable H-bonds in a conformation. In most tests, about 80% of the 10% of H-bonds predicted as the least stable are actually among the 10% truly least stable. The important attributes identified during the tree construction are consistent with previous findings. CONCLUSIONS: We use inductive learning methods to build protein-independent probabilistic models to study H-bond stability, and demonstrate that the models perform better than H-bond energy alone.

Subject(s)

Hydrogen Bonding; Models, Statistical; Molecular Dynamics Simulation; Proteins/chemistry; Algorithms; Protein Stability; Protein Structure, Secondary
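The RESULTS part of the abstract above reports how many of the H-bonds predicted as least stable are truly among the least stable. One possible reading of that measure is sketched below in Python: take the bottom fraction of bonds by predicted stability, take the bottom fraction by observed stability, and report the share of the first set that also appears in the second. The function name, the tie handling, and the sample numbers are assumptions made for this example and are not taken from the paper.

```python
def least_stable_overlap(predicted, observed, fraction=0.1):
    """Share of the predicted least-stable bonds that are also observed least stable.

    `predicted` and `observed` are equal-length sequences of stability values,
    one per H-bond, where lower means less stable.
    """
    k = max(1, int(len(predicted) * fraction))
    indices = range(len(predicted))
    # Indices of the k bonds with the lowest predicted stability.
    lowest_predicted = sorted(indices, key=lambda i: predicted[i])[:k]
    # Indices of the k bonds with the lowest observed stability.
    lowest_observed = set(sorted(indices, key=lambda i: observed[i])[:k])
    return sum(1 for i in lowest_predicted if i in lowest_observed) / k


# Hypothetical stability values for ten H-bonds.
predicted = [0.9, 0.1, 0.8, 0.2, 0.7, 0.6, 0.5, 0.95, 0.85, 0.75]
observed = [0.9, 0.15, 0.8, 0.6, 0.7, 0.3, 0.5, 0.95, 0.85, 0.75]

print(least_stable_overlap(predicted, observed, fraction=0.2))
```

With these made-up values, the two lowest predictions are bonds 1 and 3, while the two lowest observations are bonds 1 and 5, so one of the two predictions is confirmed.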
