Manara - Qatar Research Repository
Corpus-Based Quality Evaluation of Ar-En Neural Machine Translation: Google Translate as a Case Study

thesis
submitted on 2024-12-17, 12:34 and posted on 2024-12-29, 07:44 authored by Sohaila M. Elkaffash
This thesis investigates the quality of neural machine translation (NMT) for the Arabic-English language pair, taking Google Translate as a case study. A corpus-based approach is applied to both data collection and analysis. The texts examined in the case study belong to the academic genre. The latest version of TAUS’ Dynamic Quality Framework (DQF) serves as the starting point for evaluating the translated output of these texts. This version of the model is integrated with the Multidimensional Quality Metrics. The model is then modified and customized to suit the quality variables and criteria of each language on the one hand, and of the translation process for this language pair on the other. Some error subtypes were added and others were omitted. Moreover, the model’s severity scale was replaced by a translation-based one comprising, in descending order: ‘Mistranslation’, ‘Untranslated Texts’, ‘Under-translation’, ‘Over-translation’, and ‘Neutral’. ‘Neutral’ errors are considered ‘textual’ errors that affect the functionality of the text in the target language (TL) but do not necessarily affect the translational aspect of the target text (TT). This scale readily locates errors within the engine’s translation process and indicates whether the deficiency lies in the machine’s capabilities in language analysis and generation or in the translation process itself. The results show that the modified evaluation model is robust and works effectively with this language pair. In addition, the error analysis takes a close look at grammatical errors. The study concludes with a number of recommendations for future research and for applications in MT development.

Language

  • English

Publication Year

  • 2020

License statement

© The author. The author has granted HBKU and Qatar Foundation a non-exclusive, worldwide, perpetual, irrevocable, royalty-free license to reproduce, display and distribute the manuscript in whole or in part in any form to be posted in digital or print format and made available to the public at no charge. Unless otherwise specified in the copyright statement or the metadata, all rights are reserved by the copyright holder. For permission to reuse content, please contact the author.

Institution affiliated with

  • Hamad Bin Khalifa University
  • College of Humanities and Social Sciences - HBKU

Degree Date

  • 2020

Degree Type

  • Master's

Advisors

Hendrik Kockaert; Ahmed AbdelAli

Committee Members

Ahmed Alaoui; Abied Alsulaiman; Hassan Hakimian

Department/Program

College of Humanities and Social Sciences
