ABSTRACT
OBJECTIVES: Serum protein electrophoresis (SPE) in combination with immunotyping (IMT) is the diagnostic standard for detecting monoclonal proteins (M-proteins). However, interpretation of SPE and IMT is poorly standardized, time-consuming, and investigator-dependent. Here, we present five machine learning (ML) approaches for automated detection of M-proteins on SPE, developed on an unprecedentedly large and well-curated data set, and compare their performance with that of laboratory experts. METHODS: SPE and IMT were performed on serum samples from 69,722 individuals from Norway. IMT results were used to label the samples as M-protein present (positive, n=4,273) or absent (negative, n=65,449). Four feature-based ML algorithms and one convolutional neural network (CNN) were trained on 68,722 randomly selected SPE patterns to detect M-proteins. Algorithm performance was compared with that of an expert group of clinical pathologists and laboratory technicians (n=10) on a test set of 1,000 samples. RESULTS: The random forest classifier (RFC) showed the best performance (F1-score 93.2%, accuracy 99.1%, sensitivity 89.9%, specificity 99.8%, positive predictive value 96.9%, negative predictive value 99.3%) and outperformed the experts (F1-score 61.2 ± 16.0%, accuracy 89.2 ± 10.2%, sensitivity 94.3 ± 2.8%, specificity 88.9 ± 10.9%, positive predictive value 47.3 ± 16.2%, negative predictive value 99.5 ± 0.2%) on the test set. Interestingly, while the performance of the RFC saturated, the performance of the CNN increased steadily within our training set (n=68,722). CONCLUSIONS: Feature-based ML systems are capable of automated detection of M-proteins on SPE beyond expert level and show potential for use in the clinical laboratory.
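
As a reading aid, the F1-score quoted above is the harmonic mean of positive predictive value (precision) and sensitivity (recall). The following minimal sketch recomputes it from the rounded RFC values given in the abstract. The function name `f1_score` and the variable names are illustrative assumptions, not taken from the study's code.

```python
def f1_score(ppv: float, sensitivity: float) -> float:
    """Harmonic mean of positive predictive value and sensitivity."""
    if ppv + sensitivity == 0:
        return 0.0  # convention when both inputs are zero
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Rounded values reported in the abstract for the random forest classifier.
rfc_f1 = f1_score(ppv=0.969, sensitivity=0.899)
print(f"RFC F1 from rounded inputs: {rfc_f1:.3f}")
```

The result is about 0.933, which agrees with the reported 93.2% to within the rounding of the inputs. The same shortcut does not reproduce the expert F1-score, because that figure is a mean of per-expert F1-scores rather than the F1-score of the averaged sensitivity and positive predictive value.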