Performance of the supervised learning algorithms in sex estimation of the proximal femur: A comparative study in contemporary Egyptian and Turkish samples.

H Attia, MennattAllah; H Attia, Mohamed; Tarek Farghaly, Yasmin; Ahmed El-Sayed Abulnoor, Bassam; Curate, Francisco

Affiliation

H Attia M; Forensic Medicine and Clinical Toxicology, Faculty of Medicine, Alexandria University, Alexandria, Egypt. Electronic address: mennatt.hassan@alexmed.edu.eg.
H Attia M; Biomedical Engineering, Medical Research Institute, Alexandria University, Egypt; Institute for Intelligent Systems Research and Innovation, Deakin University, Australia. Electronic address: attiamohammed@hotmail.com.
Tarek Farghaly Y; Diagnostic Radiology, Faculty of Medicine, Alexandria University, Egypt.
Ahmed El-Sayed Abulnoor B; Fixed Prosthodontics, Faculty of Dentistry, Ain Shams University, Egypt.
Curate F; University of Coimbra, Research Centre for Anthropology and Health, Department of Life Sciences, Coimbra, Portugal; University of Coimbra, Laboratory of Forensic Anthropology, Department of Life Sciences, Coimbra, Portugal. Electronic address: fcurate@uc.pt.

Sci Justice; 62(3): 288-309, 2022 May.

Article in English | MEDLINE | ID: mdl-35598923

ABSTRACT

Sex estimation standards are population-specific; however, we argue that machine learning (ML) techniques may enhance biological sex determination in trans-population applications. Linear discriminant analysis (LDA) was compared with nine ML algorithms, including quadratic discriminant analysis (QDA), support vector machine (SVM), decision tree (DT), Gaussian process (GPC), naïve Bayesian (NBC), K-nearest neighbor (KNN), random forest (RFM), and adaptive boosting (Adaboost). The experiments involved two contemporary populations, Turkish (n = 300) and Egyptian (n = 100), used for training and validation, respectively. Base models were calibrated using isotonic and sigmoid calibration schemes. Results were analyzed at posterior probability (pp) thresholds of >0.95 and >0.80. At pp = 0.5, the ML algorithms yielded comparable accuracies in the training set (90% to 97%) and the test set (81% to 88%), and these were not modified by the calibration techniques. At pp >0.95, the raw RFM, LDA, QDA, and SVM models showed the best performance; however, calibration improved the performance of various classifiers, especially NBC and Adaboost. By contrast, the performance of the GPC, KNN, and QDA models worsened with calibration. RFM showed the best performance among all models at both thresholds, whereas LDA benefited the most from both calibration methods at pp >0.80. Complex ML models do not necessarily achieve better performance metrics. LDA and QDA remain the fastest and simplest classifiers. We demonstrated that ML can enhance sex estimation on an independent population sample; however, differences were detected in the underlying probability distributions generated by the models, which warrants more cautious application by forensic practitioners.
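The threshold-based evaluation described in the abstract can be illustrated with a minimal sketch. It assumes that a case is classified only when the posterior probability of its predicted sex exceeds the chosen threshold, and it reports the share of cases classified alongside the accuracy among them. The function name `evaluate_at_threshold` and the probabilities and labels are hypothetical and are not taken from the study. In practice the posteriors would come from a fitted, optionally calibrated classifier (for example, scikit-learn's `CalibratedClassifierCV` with `method="sigmoid"` or `method="isotonic"`).

```python
from typing import List, Tuple


def evaluate_at_threshold(
    p_male: List[float], is_male: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (coverage, accuracy) at a posterior probability threshold.

    A case counts as classified only if the posterior of the predicted
    class (male or female) is strictly greater than `threshold`.
    """
    classified = 0
    correct = 0
    for p, truth in zip(p_male, is_male):
        winning_posterior = max(p, 1.0 - p)  # posterior of the predicted class
        if winning_posterior > threshold:
            classified += 1
            predicted_male = p > 0.5
            if predicted_male == truth:
                correct += 1
    coverage = classified / len(p_male)
    accuracy = correct / classified if classified else float("nan")
    return coverage, accuracy


# Hypothetical posteriors P(male) and true labels, for illustration only.
p_male = [0.99, 0.97, 0.85, 0.60, 0.40, 0.15, 0.03, 0.02, 0.90, 0.10]
is_male = [True, True, True, False, False, False, False, False, False, True]

for threshold in (0.5, 0.80, 0.95):
    coverage, accuracy = evaluate_at_threshold(p_male, is_male, threshold)
    print(f"pp > {threshold}: coverage={coverage:.2f}, accuracy={accuracy:.2f}")
```

Raising the threshold trades coverage for reliability: fewer cases receive a sex estimate, but those that do are classified with higher confidence. For this reason, the calibration of the posterior probabilities matters at the stricter thresholds.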

Subjects

Algorithms; Support Vector Machine; Bayes Theorem; Egypt; Femur; Humans

Keywords

Contemporary metapopulations skeletal database; Femur sexual dimorphism; Forensic anthropology; Regional sex estimation standards; Supervised machine learning algorithms

Full text: 1 Database: MEDLINE Main subject: Algorithms / Support Vector Machine Type of study: Prognostic_studies Limits: Humans Country as subject: Africa Language: En Year of publication: 2022 Document type: Article