ABSTRACT
Identification of novel BRCA1 variants outpaces their clinical annotation, which highlights the importance of developing accurate computational methods for risk assessment. Therefore, our aim was to develop a BRCA1-specific machine learning model to predict the pathogenicity of all types of BRCA1 variants, and to apply this model together with our previous BRCA2-specific model to assess BRCA variants of uncertain significance (VUS) among Qatari patients with breast cancer. We developed an XGBoost model that utilizes variant information such as position, frequency, and consequence, as well as prediction scores from numerous in silico tools. We trained and tested the model with BRCA1 variants that were reviewed and classified by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium. In addition, we tested the model's performance on an independent set of missense VUS with experimentally determined functional scores. The model performed excellently in predicting the pathogenicity of ENIGMA-classified variants (accuracy: 99.9%) and in predicting the functional consequence of the independent set of missense variants (accuracy: 93.4%). Moreover, it predicted 2,115 potentially pathogenic variants among the 31,058 unreviewed BRCA1 variants in the BRCA Exchange database. Using the two BRCA-specific models, we did not identify any pathogenic BRCA1 variants among those found in patients in Qatar, but we predicted four potentially pathogenic BRCA2 variants, which could be prioritized for functional validation.