Manara - Qatar Research Repository
COUSCOus: improved protein contact prediction using an empirical Bayes covariance estimator

Journal contribution
Submitted on 2024-09-24, 10:47 and posted on 2024-09-24, 10:48
Authored by Reda Rawi, Raghvendra Mall, Khalid Kunji, Mohammed El Anbari, Michael Aupetit, Ehsan Ullah, Halima Bensmail

Background

The post-genomic era, with its wealth of sequences, gave rise to a broad range of methods for detecting protein residue-residue contacts. Although coevolution methods such as PSICOV, DCA and plmDCA provide correct contact predictions, their predictions do not completely overlap. Hence, new approaches and improvements to existing methods are needed to motivate further development and progress in the field. We present a new contact detection method, COUSCOus, which combines the best shrinkage approach, the empirical Bayes covariance estimator, with GLasso.
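To illustrate what covariance shrinkage does in general, the sketch below applies plain linear shrinkage toward a scaled identity matrix. This is a stand-in chosen for brevity and is not the empirical Bayes estimator used by COUSCOus; the function name `shrink_covariance` and the shrinkage weight `lam` are hypothetical.

```python
def shrink_covariance(cov, lam):
    """Linear shrinkage toward a scaled identity: (1 - lam) * S + lam * mu * I.

    Illustrative only: NOT the empirical Bayes estimator from the paper.
    cov is a square matrix given as a list of lists; lam lies in [0, 1].
    """
    p = len(cov)
    mu = sum(cov[i][i] for i in range(p)) / p  # average variance on the diagonal
    return [
        [(1.0 - lam) * cov[i][j] + (lam * mu if i == j else 0.0) for j in range(p)]
        for i in range(p)
    ]


# A singular 2x2 sample covariance (determinant 0), which cannot be inverted.
sample_cov = [[1.0, 1.0], [1.0, 1.0]]
shrunk = shrink_covariance(sample_cov, 0.2)

# After shrinkage the determinant is positive, so the matrix is invertible.
det = shrunk[0][0] * shrunk[1][1] - shrunk[0][1] * shrunk[1][0]
print(shrunk, det)
```

A well-conditioned covariance estimate of this kind is what a sparse inverse covariance procedure such as GLasso takes as input.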

Results

Using the original PSICOV benchmark dataset, COUSCOus achieves mean accuracies of 0.74, 0.62 and 0.55 for the top L/10 predicted long-, medium- and short-range contacts, respectively, where L is the sequence length. In addition, COUSCOus attains mean areas under the precision-recall curves of 0.25, 0.29 and 0.30 for long-, medium- and short-range contacts, outperforming PSICOV. We also observed that COUSCOus outperforms PSICOV with respect to the Matthews correlation coefficient on the full list of residue contacts. Furthermore, on an independent test set composed of CASP11 protein targets, COUSCOus achieves on average a 10% gain in prediction accuracy over PSICOV. Finally, we showed that in a simple random forest meta-classifier that combines contact detection techniques with sequence-derived features, PSICOV predictions should be replaced by the more accurate COUSCOus predictions.
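As a rough illustration of the top L/10 measure, the sketch below ranks predicted residue pairs within one sequence-separation class and reports the fraction of the top L/10 that are true contacts. The function name, the input layout and the long-range threshold of 24 residues are assumptions made for this example; the abstract does not state the exact separation cut-offs used.

```python
def top_l_over_10_precision(scores, true_contacts, seq_len, min_sep, max_sep=None):
    """Fraction of the top L/10 scored pairs that are true contacts.

    scores: dict mapping (i, j) with i < j to a predicted contact score.
    true_contacts: set of (i, j) pairs that are contacts in the structure.
    min_sep / max_sep: inclusive sequence-separation bounds (assumed values).
    """
    in_range = [
        (pair, score)
        for pair, score in scores.items()
        if pair[1] - pair[0] >= min_sep
        and (max_sep is None or pair[1] - pair[0] <= max_sep)
    ]
    # Highest-scoring pairs first.
    in_range.sort(key=lambda item: item[1], reverse=True)
    k = max(1, seq_len // 10)
    top = in_range[:k]
    if not top:
        return 0.0
    hits = sum(1 for pair, _ in top if pair in true_contacts)
    return hits / len(top)


# Toy data for a hypothetical protein of length 30, so L/10 = 3.
toy_scores = {
    (0, 25): 0.9,
    (1, 28): 0.8,
    (2, 27): 0.7,
    (3, 29): 0.1,
    (0, 5): 0.95,  # short range, excluded from the long-range class
}
toy_contacts = {(0, 25), (2, 27), (0, 5)}

long_range = top_l_over_10_precision(toy_scores, toy_contacts, seq_len=30, min_sep=24)
print(long_range)
```

Averaging this value over all proteins in a benchmark gives a mean score of the kind reported above for each range class.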

Conclusion

We conclude that adopting superior covariance shrinkage approaches will benefit several research fields that apply the GLasso procedure, including residue-residue contact prediction, as presented here, and fields such as gene network reconstruction.

Other Information

Published in: BMC Bioinformatics
License: https://creativecommons.org/licenses/by/4.0
See article on publisher's website: https://dx.doi.org/10.1186/s12859-016-1400-3

Funding

Open Access funding provided by the Qatar National Library.

Language

  • English

Publisher

Springer Nature

Publication Year

  • 2016

License statement

This Item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Hamad Bin Khalifa University
  • Qatar Computing Research Institute - HBKU
  • Sidra Medical and Research Center (2015-2017)
