Gene-Specific Models to Predict the Pathogenicity of BRCA1 and BRCA2 Variants
Identification of novel BRCA variants outpaces their clinical annotation, which highlights the importance of developing accurate computational methods for risk assessment. However, the use of different combinations of in silico algorithms is a major source of inconsistencies in variant classification. Moreover, the existing BRCA-specific prediction algorithms focus on predicting the functional impact of only a subtype of variants. General variant effect predictors are applicable to all subtypes but are trained on putative benign and pathogenic variants and do not account for gene-specific information, such as hotspots of pathogenic variants. Our aim was therefore to develop BRCA1- and BRCA2-specific machine learning models to predict the pathogenicity of all types of BRCA variants. We developed two XGBoost-based models that utilize variant information, including position, frequency, and consequence, as well as prediction scores from numerous in silico tools. We trained the models on 80% of the variants expert-reviewed by the Evidence-Based Network for the Interpretation of Germline Mutant Alleles (ENIGMA) consortium and tested their performance on the remaining 20%, as well as on an independent set of variants of uncertain significance with experimentally determined functional scores.
The novel BRCA gene-specific models predicted the pathogenicity of ENIGMA variants with an accuracy of 99.9%. They also performed well in predicting the functional consequence of the independent set of variants, with an accuracy of up to 93.4% for the BRCA1 model and 91.3% for the BRCA2 model. These new BRCA1 and BRCA2 gene-specific models can be used to combine the different prediction tools, thereby eliminating classification inconsistencies, and to prioritize unreviewed variants for functional analysis or expert review.
Language
- English
Publication Year
- 2022
License statement
© The author. The author has granted HBKU and Qatar Foundation a non-exclusive, worldwide, perpetual, irrevocable, royalty-free license to reproduce, display and distribute the manuscript in whole or in part in any form to be posted in digital or print format and made available to the public at no charge. Unless otherwise specified in the copyright statement or the metadata, all rights are reserved by the copyright holder. For permission to reuse content, please contact the author.
Institution affiliated with
- Hamad Bin Khalifa University
- College of Health and Life Sciences - HBKU
Degree Date
- 2022
Degree Type
- Doctorate