Manara - Qatar Research Repository
Browse
DOCUMENT
10.1016_j.csi.2024.103917.pdf (2.37 MB)
DOCUMENT
supp_1-s2.0-S0920548924000862-mmc1.docx (1.14 MB)
1/0
2 files

Large language models for code completion: A systematic literature review

journal contribution
submitted on 2024-09-30, 05:12 and posted on 2024-09-30, 05:15 authored by Rasha Ahmad Husein, Hala Aburajouh, Cagatay Catal

Code completion serves as a fundamental aspect of modern software development, improving developers' coding processes. Integrating code completion tools into an Integrated Development Environment (IDE) or code editor enhances the coding process and boosts productivity by reducing errors and speeding up code writing while reducing cognitive load. This is achieved by predicting subsequent tokens, such as keywords, variable names, types, function names, operators, and more. Different techniques can achieve code completion, and recent research has focused on Deep Learning methods, particularly Large Language Models (LLMs) utilizing Transformer algorithms. While several research papers have focused on the use of LLMs for code completion, these studies are fragmented, and there is no systematic overview of the use of LLMs for code completion. Therefore, we aimed to perform a Systematic Literature Review (SLR) study to investigate how LLMs have been applied for code completion so far. We have formulated several research questions to address how LLMs have been integrated for code completion-related tasks and to assess the efficacy of these LLMs in the context of code completion. To achieve this, we retrieved 244 papers from scientific databases using auto-search and specific keywords, finally selecting 23 primary studies based on an SLR methodology for in-depth analysis. This SLR study categorizes the granularity levels of code completion achieved by utilizing LLMs in IDEs, explores the existing issues in current code completion systems, how LLMs address these challenges, and the pre-training and fine-tuning methods employed. Additionally, this study identifies open research problems and outlines future research directions. Our analysis reveals that LLMs significantly enhance code completion performance across several programming languages and contexts, and their capability to predict relevant code snippets based on context and partial input boosts developer productivity substantially.

Other Information

Published in: Computer Standards & Interfaces
License: http://creativecommons.org/licenses/by/4.0/
See article on publisher's website: https://dx.doi.org/10.1016/j.csi.2024.103917

Funding

Open Access funding provided by the Qatar National Library.

History

Language

  • English

Publisher

Elsevier

Publication Year

  • 2024

License statement

This Item is licensed under the Creative Commons Attribution 4.0 International License.

Institution affiliated with

  • Qatar University
  • College of Engineering - QU

Usage metrics

    Qatar University

    Licence

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC