Repository-Level Code Generation with Retrieval-Augmented Large Language Model Agents

Veera Ravindra Divi

Abstract

Most code-generation benchmarks score a model on self-contained functions in a single file. Real software is written inside repositories, where a correct edit depends on cross-file types, project conventions, and executable test suites, a setting that today’s public benchmarks largely ignore. We present RepoForge, a benchmark and an open retrieval-augmented agent framework for repository-level code generation. RepoForge contains  executable items mined from  permissively licensed GitHub repositories across five programming languages, spanning four task families: single-file completion, cross-file completion, program repair, and feature implementation. Every item ships with a project context and a hidden test suite, so correctness is measured by execution rather than by textual overlap. The accompanying RepoForge agent couples structure-aware dense retrieval with an iterative reason–retrieve–edit–test loop that grounds generation in the repository. Across six 2023-era code models, retrieval-augmented prompting lifts execution Pass@1 by – points over a no-retrieval baseline, and the RepoForge agent adds a further – points over static dense retrieval, reaching  Pass@1 with GPT-4. We release the benchmark, the harness, and all evaluation scripts to serve as a reproducible baseline for repository-level software engineering with LLMs.
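
The abstract reports execution Pass@1 but does not say how it is estimated. As a minimal sketch, assuming the widely used unbiased pass@k estimator (n sampled generations per item, c of which pass the hidden tests), the score could be computed as below. The function names and the per-item (n, c) input format are hypothetical and are not taken from the RepoForge harness.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one benchmark item.

    n: generations sampled for the item
    c: generations that passed the item's hidden test suite
    k: budget of attempts being scored (k = 1 gives Pass@1)
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # Fewer than k failing samples: every size-k draw contains a passing one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples fail.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over items; `results` holds one (n, c) pair per item."""
    if not results:
        raise ValueError("results must not be empty")
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimator reduces to c / n, so with a single greedy generation per item Pass@1 is simply the fraction of items whose generated edit passes its hidden tests.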

Article Details

How to Cite
Veera Ravindra Divi. (2024). Repository-Level Code Generation with Retrieval-Augmented Large Language Model Agents. International Journal on Recent and Innovation Trends in Computing and Communication, 12(2), 296–301. Retrieved from https://www.ijritcc.org/index.php/ijritcc/article/view/10659