Your browser doesn't support javascript.
loading
Show: 20 | 50 | 100
Results 1 - 1 de 1
Filter
Add more filters

Database
Language
Affiliation country
Publication year range
1.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37522865

ABSTRACT

MOTIVATION: Random forest is a popular machine learning approach for the analysis of high-dimensional data because it is flexible and provides variable importance measures for the selection of relevant features. However, the complex relationships between the features are usually not considered for the selection and thus also neglected for the characterization of the analysed samples. RESULTS: Here we propose two novel approaches that focus on the mutual impact of features in random forests. Mutual forest impact (MFI) is a relation parameter that evaluates the mutual association of the features to the outcome and, hence, goes beyond the analysis of correlation coefficients. Mutual impurity reduction (MIR) is an importance measure that combines this relation parameter with the importance of the individual features. MIR and MFI are implemented together with testing procedures that generate P-values for the selection of related and important features. Applications to one experimental and various simulated datasets and the comparison to other methods for feature selection and relation analysis show that MFI and MIR are very promising to shed light on the complex relationships between features and outcome. In addition, they are not affected by common biases, e.g. that features with many possible splits or high minor allele frequencies are preferred. AVAILABILITY AND IMPLEMENTATION: The approaches are implemented in Version 0.3.3 of the R package RFSurrogates that is available at github.com/AGSeifert/RFSurrogates and the data are available at doi.org/10.25592/uhhfdm.12620.


Subject(s)
Machine Learning , Random Forest , Bias , Gene Frequency
SELECTION OF CITATIONS
SEARCH DETAIL