Manara - Qatar Research Repository
Browse
DOCUMENT
10.1016_j.ymeth.2024.08.001.pdf (1.91 MB)
DOCUMENT
supp_1-s2.0-S1046202324001828-mmc1.docx (413 kB)
1/0
2 files

StackDPPred: Multiclass prediction of defensin peptides using stacked ensemble learning with optimized features

journal contribution
submitted on 2024-09-30, 09:09 and posted on 2024-09-30, 09:12 authored by Muhammad Arif, Saleh Musleh, Ali Ghulam, Huma Fida, Yasser Alqahtani, Tanvir Alam

Host defense or antimicrobial peptides (AMPs) are promising candidates for protecting host against microbial pathogens for example bacteria, virus, fungi, yeast. Defensins are the type of AMPs that act as potential therapeutic drug agent and perform vital role in various biological process. Conventional Experiments to identify defensin peptides (DPs) are time consuming and expensive. Thus, the shortcomings of wet lab experiments are leveraged by computational methods to accurately predict the functional types of DPs. In this paper, we aim to propose a novel multi-class ensemble-based prediction model called StackDPPred for identifying the properties of DPs. The peptide sequences are encoded using split amino acid composition (SAAC), segmented position specific scoring matrix (SegPSSM), histogram of oriented gradients-based PSSM (HOGPSSM) and feature extraction based graphical and statistical (FEGS) descriptors. Next, principal component analysis (PCA) is used to select the best subset of attributes. After that, the optimized features are fed into single machine learning and stacking-based ensemble classifiers. Furthermore, the ablation study demonstrates the robustness and efficacy of the stacking approach using reduced features for predicting DPs and their families. The proposed StackDPPred method improves the overall accuracy by 13.41% and 7.62% compared to existing DPs predictors iDPF-PseRAAC and iDEF-PseRAAC, respectively on validation test. Additionally, we applied the local interpretable model-agnostic explanations (LIME) algorithm to understand the contribution of selected features to the overall prediction. We believe, StackDPPred could serve as a valuable tool accelerating the screening of large-scale DPs and peptide-based drug discovery process.

Other Information

Published in: Methods
License: http://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.1016/j.ymeth.2024.08.001

Funding

Open Access funding provided by the Qatar National Library.

History

Language

  • English

Publisher

Elsevier

Publication Year

  • 2024

License statement

This Item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Science and Engineering - HBKU