Manara - Qatar Research Repository

Enhancing the Machine Learning Pipeline for a Sustainable Future

thesis
submitted on 2024-10-29, 10:31 and posted on 2024-10-30, 04:43 authored by Ashhadul Islam
The energy appetite of AI is a growing environmental concern. Training large machine learning models requires substantial computational resources, which in turn consume significant energy. This leads to the emission of greenhouse gases that contribute to climate change. In addition to carbon emissions, the amount of water used to cool GPU clusters is also under scrutiny. Although the amount of data in the world is finite, AI models can be as large as the memory capacity and computational power used to train them. As these models become larger, the costs of computation, carbon emissions, and water consumption increase with them. Consequently, there is a need to develop more energy-efficient machine learning algorithms and computational infrastructure in order to minimize the impact on the environment. This thesis focuses on pruning neural networks to produce faster and more compact models that are many times smaller yet retain the same level of discrimination capability. This approach also improves the model’s resistance to adversarial attacks and reduces data requirements during transfer learning. The methods presented were tested against existing compression techniques using standard models and datasets, and achieved high compression rates and accuracy. The thesis also addresses the problem of imbalance in datasets by introducing a novel data oversampling algorithm that creates new data points to enhance the decision capability of the discriminators. The approach is extended to regression data as well as image data. We compare the efficacy of our algorithm with state-of-the-art data oversamplers on benchmark datasets to show the superiority of our algorithm. In some cases, this method also gives better results than Generative Adversarial Networks.
The pruning algorithm is available as open source at https://github.com/ashhadulislam/SmartPruningDeepNN, and the oversampling technique is published as a Python library at https://pypi.org/project/knnor/.
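
To make the idea of "creating new data points" for an under-represented class concrete, the following is a minimal sketch of generic nearest-neighbour interpolation (in the style of SMOTE). It is not the thesis's algorithm and does not use the knnor library; the function name interpolate_oversample, its parameters, and the sample data are hypothetical and chosen only for illustration.

```python
import math
import random


def interpolate_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    point and one of its k nearest minority neighbours (SMOTE-style).

    This is an illustrative assumption, not the method from the thesis.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority points")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority point as the base.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the remaining minority points by Euclidean distance to it.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment
        # between the base point and the chosen neighbour.
        t = rng.random()
        synthetic.append(tuple(b + t * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = interpolate_oversample(minority, n_new=5)
```

Because each synthetic point lies on a segment between two existing minority points, the new samples stay inside the region the minority class already occupies. How neighbours are selected and validated is where a dedicated oversampler would differ from this sketch.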

Language

  • English

Publication Year

  • 2023

License statement

© The author. The author has granted HBKU and Qatar Foundation a non-exclusive, worldwide, perpetual, irrevocable, royalty-free license to reproduce, display and distribute the manuscript in whole or in part in any form to be posted in digital or print format and made available to the public at no charge. Unless otherwise specified in the copyright statement or the metadata, all rights are reserved by the copyright holder. For permission to reuse content, please contact the author.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Science and Engineering - HBKU

Degree Date

  • 2023

Degree Type

  • Doctorate

Advisors

Brahim Belhaouari Samir

Committee Members

Zain Abdelwahid Ibrahim Mohamed ; Abdulazeem Abozaid ; Magdy Abita

Department/Program

College of Science & Engineering
