Results 1 - 3 of 3
1.
Accid Anal Prev ; 150: 105936, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33338913

ABSTRACT

Crash data are often heavily imbalanced: fatal injury (minority) crashes are significantly underrepresented relative to non-fatal injury (majority) crashes. This imbalance poses a major challenge to most statistical learning methods and needs to be addressed during data preprocessing. To this end, we compare three data balancing methods: the Synthetic Minority Oversampling Technique (SMOTE), Borderline SMOTE (BL-SMOTE), and Majority Weighted Minority Oversampling (MWMOTE). We then examine different Bayesian networks (BNs) to explore the contributing factors of fatal injury crashes. The 2016 highway crash data of Ghana are used for the case study. The results show that the accuracy of injury severity classification improves when the preprocessed data are used, with the highest improvement observed on data preprocessed by the MWMOTE technique. Statistical verification is done with the Wilcoxon signed-rank test. Inference with the best BNs identifies the significant factors of fatal crashes, which include off-peak time, non-intersection areas, pedestrian-involved collisions, rural road environments, good tarred roads, roads without shoulders, and multiple-vehicle crashes.
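
As a rough illustration of the oversampling idea named in this abstract, the sketch below implements only the basic SMOTE interpolation step: a synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. It is not the authors' code and it does not cover BL-SMOTE or MWMOTE, which choose and weight the base samples differently. The function name `smote`, the parameter `k`, and the toy points are assumptions made for the example; real crash records are largely categorical and would need encoding first.

```python
import math
import random

def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base sample (base excluded)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        mate = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> mate
        synthetic.append(tuple(b + gap * (m - b) for b, m in zip(base, mate)))
    return synthetic

# Hypothetical numeric features for four fatal (minority) crashes.
fatal = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
new_points = smote(fatal, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region already occupied by the minority class.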


Subject(s)
Pedestrians , Wounds and Injuries , Accidents, Traffic , Bayes Theorem , Ghana/epidemiology , Humans , Rural Population , Wounds and Injuries/epidemiology
2.
Accid Anal Prev ; 151: 105851, 2021 Mar.
Article in English | MEDLINE | ID: mdl-33383521

ABSTRACT

The study aims to identify relevant variables that improve the prediction performance of the crash injury severity (CIS) classification model. Unfortunately, CIS databases are invariably characterized by class imbalance. For instance, samples of the multiple fatal injury (MFI) severity class are typically rare compared with other classes. This imbalance may introduce a prediction bias in favour of the majority class and affect the quality of the learning algorithm. The paper proposes an ensemble-based variable ranking scheme that incorporates data resampling. At the data pre-processing level, majority weighted minority oversampling (MWMOTE) is employed to treat the imbalanced training data. An ensemble of classifiers induced from the balanced data is used to evaluate and rank the individual variables according to their importance to injury severity prediction. The relevant variables selected are then applied to the balanced data to form a training set for CIS classification modelling. An empirical comparison is conducted by considering variable ranking by: 1) a single inductive algorithm learned from imbalanced data, where the relevant variables identified are applied to the imbalanced data to form the training data; 2) a single inductive algorithm learned from MWMOTE data, where the relevant variables identified are applied to the balanced data to form the training data; and 3) ensembles learned from imbalanced data, where the relevant variables identified are applied to the imbalanced data to form the training data. Bayesian network (BN) classifiers are then developed for each ranking method, using nested subsets of the top-ranked variables. The model predictions are assessed with four performance indicators in the comparative study. Based on three years (2014-2016) of crash data from Ghana, the empirical results show that the proposed method is effective in identifying the most prolific predictors of the CIS level. Finally, based on the inference results of the BNs developed on the best subset, the study offers the most probable explanations for the occurrence of MFI crashes in Ghana.
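
The ensemble ranking step described in this abstract could be sketched roughly as follows. The abstract does not say which classifiers form the ensemble or how importance is scored, so everything specific here is an assumption: each ensemble member is a bootstrap resample of the (already balanced) data, and a variable's score is the accuracy of a one-variable majority rule on that resample. The names `one_rule_accuracy` and `ensemble_rank` and the toy records are hypothetical.

```python
import random
from collections import Counter, defaultdict

def one_rule_accuracy(rows, labels, var):
    """Accuracy of predicting the majority label for each value of one variable."""
    by_value = defaultdict(Counter)
    for row, label in zip(rows, labels):
        by_value[row[var]][label] += 1
    correct = sum(c.most_common(1)[0][1] for c in by_value.values())
    return correct / len(rows)

def ensemble_rank(rows, labels, n_members=25, seed=0):
    """Rank variables by their score summed over bootstrap ensemble members."""
    rng = random.Random(seed)
    variables = list(rows[0])
    totals = dict.fromkeys(variables, 0.0)
    n = len(rows)
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        for var in variables:
            totals[var] += one_rule_accuracy(sample_rows, sample_labels, var)
    return sorted(variables, key=lambda v: totals[v], reverse=True)

# Toy balanced data: "road" separates the classes, "weather" does not.
rows = [
    {"road": "rural", "weather": "clear"}, {"road": "rural", "weather": "rain"},
    {"road": "rural", "weather": "clear"}, {"road": "rural", "weather": "rain"},
    {"road": "urban", "weather": "clear"}, {"road": "urban", "weather": "rain"},
    {"road": "urban", "weather": "clear"}, {"road": "urban", "weather": "rain"},
]
labels = ["MFI"] * 4 + ["other"] * 4

ranking = ensemble_rank(rows, labels)
# Nested subsets of the top-ranked variables, as used to train the BN classifiers.
nested_subsets = [ranking[:k] for k in range(1, len(ranking) + 1)]
```

Averaging over resamples is what distinguishes this from ranking with a single learner: a variable has to score well consistently, not just on one draw of the training data.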


Subject(s)
Accidents, Traffic/statistics & numerical data , Algorithms , Accidents, Traffic/prevention & control , Bayes Theorem , Databases, Factual , Ghana/epidemiology , Humans
3.
Int J Inj Contr Saf Promot ; 27(3): 266-275, 2020 Sep.
Article in English | MEDLINE | ID: mdl-32233749

ABSTRACT

The quality of vehicular collision data is crucial for studying the relationship between injury severity and collision factors. Misclassified injury severity data in the crash dataset, however, may cause inaccurate parameter estimates and consequently lead to biased conclusions and poorly designed countermeasures. This is particularly true for imbalanced data, where the number of samples in one class far outnumbers the other. To improve the classification performance for injury severity, the paper presents a robust noise filtering technique to deal with mislabels in the imbalanced crash dataset using advanced machine learning algorithms. We examine state-of-the-art filtering algorithms, including Iterative Noise Filtering based on the Fusion of Classifiers (INFFC), the Iterative Partitioning Filter (IPF), and the Saturation Filter (SatF). In the case study of Cairo (Egypt), the empirical results show that: (1) the mislabels in crash data significantly influence the injury severity predictions, and (2) the proposed M-IPF filter outperforms its counterparts in terms of effectiveness and efficiency in eliminating the mislabels in crash data. The test results demonstrate the efficacy of the M-IPF in handling data noise and mitigating its impacts.
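
For readers unfamiliar with partition-based noise filtering, the following is a minimal sketch of a plain Iterative Partitioning Filter, not the M-IPF variant proposed in the paper, whose modifications the abstract does not describe. The simplifications are assumptions made for the example: the base learner is a toy nearest-centroid classifier, partitions are formed round-robin rather than at random, an instance is flagged when a majority of the partition models misclassify it, and iteration stops as soon as nothing is flagged. The data are hypothetical.

```python
import math

def train_centroids(part):
    """Toy base learner: per-class mean of the feature vectors in one partition."""
    groups = {}
    for x, y in part:
        groups.setdefault(y, []).append(x)
    return {y: tuple(sum(c) / len(xs) for c in zip(*xs)) for y, xs in groups.items()}

def predict(centroids, x):
    """Label of the centroid closest to x."""
    return min(centroids, key=lambda y: math.dist(centroids[y], x))

def ipf(data, n_parts=3, max_iter=10):
    """Repeatedly drop instances that most partition models misclassify."""
    data = list(data)
    removed = []
    for _ in range(max_iter):
        parts = [data[i::n_parts] for i in range(n_parts)]  # round-robin split
        models = [train_centroids(p) for p in parts if p]
        noisy = [
            (x, y) for x, y in data
            if sum(predict(m, x) != y for m in models) > len(models) / 2
        ]
        if not noisy:  # simplified stopping rule
            break
        removed.extend(noisy)
        data = [d for d in data if d not in noisy]
    return data, removed

# Two well-separated clusters; the last record is deliberately mislabelled.
records = [
    ((0.0,), "minor"), ((10.0,), "fatal"), ((0.2,), "minor"), ((10.2,), "fatal"),
    ((0.4,), "minor"), ((10.4,), "fatal"), ((0.1,), "minor"), ((10.1,), "fatal"),
    ((0.3,), "minor"), ((10.3,), "fatal"), ((0.5,), "minor"), ((0.25,), "fatal"),
]
cleaned, removed = ipf(records)
```

On this toy input the only record flagged is the one at 0.25 labelled "fatal", since every partition model places it in the "minor" cluster.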


Subject(s)
Accidents, Traffic , Machine Learning , Quality Improvement , Triage/standards , Adolescent , Adult , Databases, Factual , Egypt , Female , Humans , Male , Middle Aged , Occupational Injuries , Trauma Severity Indices , Wounds and Injuries , Young Adult