Project-Level Encoding for Neural Source Code Summarization of Subroutines

Bansal, Aakash; Haque, Sakib; McMillan, Collin

Abstract:Source code summarization of a subroutine is the task of writing a short, natural language description of that subroutine. The description usually serves in documentation aimed at programmers, where even brief phrase (e.g. "compresses data to a zip file") can help readers rapidly comprehend what a subroutine does without resorting to reading the code itself. Techniques based on neural networks (and encoder-decoder model designs in particular) have established themselves as the state-of-the-art. Yet a problem widely recognized with these models is that they assume the information needed to create a summary is present within the code being summarized itself - an assumption which is at odds with program comprehension literature. Thus a current research frontier lies in the question of encoding source code context into neural models of summarization. In this paper, we present a project-level encoder to improve models of code summarization. By project-level, we mean that we create a vectorized representation of selected code files in a software project, and use that representation to augment the encoder of state-of-the-art neural code summarization techniques. We demonstrate how our encoder improves several existing models, and provide guidelines for maximizing improvement while controlling time and resource costs in model size.

Comments:	10 pages + 2 for references. Accepted at ICPC 2021
Subjects:	Software Engineering (cs.SE); Machine Learning (cs.LG)
Cite as:	arXiv:2103.11599 [cs.SE]
	(or arXiv:2103.11599v1 [cs.SE] for this version)
	http://doi.org/10.48550/arXiv.2103.11599

Computer Science > Software Engineering

Title:Project-Level Encoding for Neural Source Code Summarization of Subroutines

Submission history

Access Paper:

References & Citations

DBLP - CS Bibliography

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators