This is a replication package for our work on the evaluation of code generation metrics.
- The library for computing code generation metrics is available on PyPI: `pip install codegen-metrics` (a quick smoke test follows this list).
- The pre-print is available on arXiv.
- The article has been published in the Journal of Systems and Software (see the citation below).
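As a quick smoke test of the installation, you can query the installed distribution's version. This uses only the Python standard library, so it does not depend on the package's public API:

```python
# Smoke test: confirm the package installed from PyPI is visible.
# Uses the distribution name from `pip install`, not the import name.
from importlib.metadata import version

print(version("codegen-metrics"))
```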
We use Poetry to manage the environment and dependency versions.
- You can find Poetry's installation manual here.
- Run `poetry install` to set up the environment.
To run the grading scripts, you will also need to install Tkinter.
- For Linux users: `sudo apt-get install python3-tk`
- For macOS users: `brew install [email protected]`
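To verify the Tkinter installation, you can run a minimal check (it needs a display, since it briefly creates a window):

```python
# Sanity check: Tkinter is importable and can create a root window.
import tkinter

root = tkinter.Tk()
print("Tk version:", root.tk.call("info", "patchlevel"))
root.destroy()
```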
To run metric computations, you will also need tree-sitter.
- To use it, run `git clone https://github.com/tree-sitter/tree-sitter-python.git build/tree-sitter-python`.
- To make sure that you use the right version of the grammar, check out the specific commit: `cd build/tree-sitter-python && git checkout 9e53981`
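For reference, here is a minimal sketch of how the cloned grammar can be compiled and used from Python. It assumes the py-tree-sitter bindings at a version below 0.22, where `Language.build_library` is still available; the actual loading code used by the metrics lives in this repository, see `02-compute-metrics.ipynb`:

```python
# Sketch, not the package's own loader: compile the cloned grammar into a
# shared library and parse a snippet. Assumes `pip install "tree_sitter<0.22"`.
from tree_sitter import Language, Parser

# Paths follow the `git clone` command above.
Language.build_library("build/python.so", ["build/tree-sitter-python"])
PY_LANGUAGE = Language("build/python.so", "python")

parser = Parser()
parser.set_language(PY_LANGUAGE)
tree = parser.parse(b"def add(a, b):\n    return a + b\n")
print(tree.root_node.sexp())  # prints the S-expression of the parse tree
```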
We expect all scripts to be run from the root directory of this repository.
- `metrics_evaluation/grading` contains Python scripts that run a simple GUI for grading the HS and CoNaLa datasets.
- `metrics_evaluation/metrics` contains code to run all the metrics studied in our work. For usage examples, refer to `02-compute-metrics.ipynb`.
- `metrics_evaluation/analysis` contains code for bootstrapping and analysis. It is further used in `03-bootstrap.ipynb`.
- The `data` directory contains all the data: intentions, generations from all models, human grades, etc.
@article{evtikhiev2023metrics,
title = {Out of the BLEU: How should we assess quality of the Code Generation models?},
journal = {Journal of Systems and Software},
pages = {111741},
year = {2023},
issn = {0164-1212},
doi = {10.1016/j.jss.2023.111741},
url = {https://www.sciencedirect.com/science/article/pii/S016412122300136X},
author = {Mikhail Evtikhiev and Egor Bogomolov and Yaroslav Sokolov and Timofey Bryksin},
keywords = {Code generation, Metrics, Neural networks, Code similarity},
}