Manara - Qatar Research Repository

Improving Trainability of ML-based Intrusion Detection Models Through Data Augmentation using Generative Adversarial Network (GAN) in a Smart Grid Environment

thesis
Submitted on 2025-06-18, 08:55 and posted on 2025-06-18, 08:57; authored by Mohammed Abdulla J. M. Alsooj

This thesis explores the evolving cybersecurity challenges faced by Supervisory Control and Data Acquisition (SCADA) communication systems within power generation, transmission, and distribution networks, and examines the current threat landscape and ongoing security concerns. An analysis of existing cybersecurity vulnerabilities in the IEC substation standards that govern communication between SCADA servers reveals potential gaps and motivates enhancements to the security and performance of the intrusion detection system (IDS). In particular, we focused on the SCADA communication protocols used within electrical plant substations, namely IEC 61850, and explored machine learning-based security solutions. We adopted a dataset with five different attacks, namely inside-substation attack, connection loss attack, modification attack, scanning attack, and interruption attack, which we used to train an intrusion detection model to detect malicious traffic. However, machine learning models typically require large training datasets to reach high accuracy, and such datasets are usually not readily available for SCADA systems. Therefore, we proposed a generative model that synthesizes samples to augment the original dataset, increasing the number of samples per attack and ultimately improving the performance of the intrusion detection model. Three primary scenarios were used to generate the new samples. In the first scenario, we generate data in constant increments. In the second, we generate an equal number of samples per label. Finally, in the third scenario, we generate samples in an equal ratio per traffic type. The results show an improvement from 73.78% (with the original dataset) to 82.31% in the first scenario, 99.6% in the second scenario, and 99.88% in the third scenario. The generated dataset can also be used for further studies by researchers working on the security of smart grid systems in electrical plants.
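
As a rough illustration of how the three generation scenarios differ, the sketch below computes how many synthetic samples each label would receive under one possible reading of each scenario. The abstract does not define the scenarios precisely, so the interpretations, the function names, the "normal" traffic label, and the example counts are all assumptions made here for illustration and are not taken from the thesis.

```python
# Hypothetical per-label sample counts; these numbers are NOT from the thesis.
example_counts = {
    "normal": 1000,
    "inside_substation": 100,
    "connection_loss": 50,
    "modification": 80,
    "scanning": 120,
    "interruption": 40,
}


def constant_increment(counts, increment):
    """Scenario 1 (assumed reading): add the same number of synthetic
    samples to every label, regardless of its current size."""
    return {label: increment for label in counts}


def equal_per_label(counts):
    """Scenario 2 (assumed reading): top every label up to the size of
    the largest label so that all labels end with equal counts."""
    target = max(counts.values())
    return {label: target - n for label, n in counts.items()}


def equal_ratio_by_traffic_type(counts, normal_label="normal"):
    """Scenario 3 (assumed reading): balance total attack traffic against
    normal traffic, spreading any attack deficit evenly over attack labels."""
    attack_labels = [label for label in counts if label != normal_label]
    attack_total = sum(counts[label] for label in attack_labels)
    deficit = counts[normal_label] - attack_total

    plan = {label: 0 for label in counts}
    if deficit < 0:
        # More attack than normal traffic: synthesize normal samples.
        plan[normal_label] = -deficit
    elif deficit > 0:
        # More normal than attack traffic: split the deficit across attacks,
        # giving the first `extra` labels one additional sample each.
        share, extra = divmod(deficit, len(attack_labels))
        for i, label in enumerate(attack_labels):
            plan[label] = share + (1 if i < extra else 0)
    return plan
```

Each function returns a plan mapping a label to the number of samples a generative model would be asked to produce for it; the generation step itself is outside the scope of this sketch.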

Language

  • English

Publication Year

  • 2024

License statement

© The author. The author has granted HBKU and Qatar Foundation a non-exclusive, worldwide, perpetual, irrevocable, royalty-free license to reproduce, display and distribute the manuscript in whole or in part in any form to be posted in digital or print format and made available to the public at no charge. Unless otherwise specified in the copyright statement or the metadata, all rights are reserved by the copyright holder. For permission to reuse content, please contact the author.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Science and Engineering - HBKU

Degree Date

  • 2024

Degree Type

  • Master's

Advisors

Saif M. Al-Kuwari

Committee Members

Mohamed Abdullah | Samir Brahim Belhaouari | Mounir Hamdi

Department/Program

College of Science and Engineering
