MergeSE — Post-hoc Model Merging for Software Engineering

Tool artifact for the ASE 2026 Tool-Track submission "MergeSE: Post-hoc Model Merging for Software Engineering Tasks Without Retraining"

The underlying model-merging approach comes from our research-track submission (under review): "A Unified Model for Cross-Domain Clone Detection via Model Merging"

MergeSE merges fine-tuned HuggingFace encoder checkpoints (CodeBERT, GraphCodeBERT, UniXcoder, CodeT5-encoder, …) into a single model without any training data, and evaluates the result on standard software-engineering benchmarks. It ships as a single-file CLI and a web tool that share the same engine.

It implements:

TIES-Merging (Yadav et al., NeurIPS 2023)
DARE-TIES (Yu et al., 2024)
Task-vector averaging (Ilharco et al., 2022)

It also provides end-to-end evaluation on SE classification tasks (clone detection, vulnerability detection, defect prediction, code-smell detection, commit classification, code-review acceptability, comment-code consistency, exception-type prediction, type inference, and any custom CSV), plus one-command export to HuggingFace, ONNX, or TorchScript.
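To make the merging methods listed above more concrete, here is a minimal sketch of task-vector averaging and the three TIES steps (trim, elect sign, disjoint mean) on plain Python lists. It is an illustration under stated assumptions, not MergeSE's actual implementation: the function names, the `density` and `scale` parameters, and the per-parameter trimming are all hypothetical simplifications, and real checkpoints would be tensors rather than lists.

```python
from typing import Dict, List

Params = Dict[str, List[float]]  # parameter name -> flattened weights (toy stand-in for tensors)


def task_vector(finetuned: Params, base: Params) -> Params:
    """Task vector = fine-tuned weights minus base weights, per parameter."""
    return {k: [f - b for f, b in zip(finetuned[k], base[k])] for k in base}


def average_merge(base: Params, models: List[Params], scale: float = 1.0) -> Params:
    """Add the scaled mean of all task vectors back onto the base weights."""
    vectors = [task_vector(m, base) for m in models]
    merged = {}
    for name, base_vals in base.items():
        columns = zip(*(v[name] for v in vectors))  # one tuple per coordinate
        merged[name] = [b + scale * sum(col) / len(vectors)
                        for b, col in zip(base_vals, columns)]
    return merged


def trim(values: List[float], density: float) -> List[float]:
    """TIES step 1: keep the largest-magnitude `density` fraction, zero the rest."""
    keep = max(1, round(density * len(values)))
    threshold = sorted((abs(v) for v in values), reverse=True)[keep - 1]
    return [v if abs(v) >= threshold else 0.0 for v in values]


def ties_merge(base: Params, models: List[Params],
               density: float = 0.2, scale: float = 1.0) -> Params:
    """TIES steps 2 and 3: elect a sign per coordinate, then average agreeing entries."""
    vectors = [{k: trim(v, density) for k, v in task_vector(m, base).items()}
               for m in models]
    merged = {}
    for name, base_vals in base.items():
        out = []
        for i, b in enumerate(base_vals):
            column = [v[name][i] for v in vectors]
            sign = 1.0 if sum(column) >= 0 else -1.0        # elected sign
            agreeing = [c for c in column if c * sign > 0]  # drop zeros and conflicts
            delta = sum(agreeing) / len(agreeing) if agreeing else 0.0
            out.append(b + scale * delta)
        merged[name] = out
    return merged
```

DARE-TIES differs mainly in the first step: as described by Yu et al., it randomly drops task-vector entries and rescales the survivors instead of trimming by magnitude.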

Three ways to use MergeSE

Pick whichever fits your environment — all three sit on top of the same merging engine, so results are identical.

#	Path	Best for
1	Web tool via Docker	Easiest setup. One command brings up the UI and REST API.
2	Web tool without Docker	Same UI, but you'd rather run Flask directly in a venv.
3	CLI tool	Scripting, headless servers, reproducible runs, paper-grade evaluation.

Full references: docs/WEB.md and docs/CLI.md.

Quickstart

git clone https://github.com/srlabUsask/MergeSE.git
cd MergeSE

1. Web tool with Docker

docker compose up -d --build      # → http://localhost:8765

Open http://localhost:8765 in your browser. Stop with docker compose down.

Common first-run issues:

permission denied … /var/run/docker.sock: your user isn't in the docker group. One-time fix: run sudo usermod -aG docker $USER, then log out and back in (or run newgrp docker in the current shell) so the new group membership takes effect. Alternatively, prefix the one-off command with sudo.
address already in use … 8765: something else is bound to port 8765. Either stop it (sudo ss -ltnp | grep :8765 shows the PID), or remap the host port in docker-compose.yml (e.g. "127.0.0.1:8766:8765") and use http://localhost:8766 instead.

2. Web tool without Docker (Flask in a venv)

python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install ".[server,datasets]"

python server/app.py               # → http://localhost:8765

For production, swap python server/app.py for gunicorn -c deploy/gunicorn.conf.py server.app:app.

3. CLI tool

python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install --index-url https://download.pytorch.org/whl/cpu torch
pip install .

mergese inspect ./model_a ./model_b --base microsoft/codebert-base
mergese merge   ./model_a ./model_b --base microsoft/codebert-base \
                --method ties --output ./merged
mergese evaluate ./merged --task clone_detection --test-file ./test.csv
mergese export   ./merged --format onnx --output ./merged.onnx

This installs a mergese console script on your $PATH. To run without installing, use python mergese.py … after pip install -r requirements.txt.
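For scripted or repeated runs, the merge and evaluate commands above can be driven from Python. The sketch below only assembles the subcommands and flags shown in this README; the helper names `pipeline_commands` and `run_pipeline` are hypothetical, and it assumes the mergese console script is already on your $PATH.

```python
import subprocess
from typing import List


def pipeline_commands(models: List[str], base: str, merged_dir: str,
                      task: str, test_file: str,
                      method: str = "ties") -> List[List[str]]:
    """Build the merge and evaluate invocations from the quickstart as argv lists."""
    return [
        ["mergese", "merge", *models, "--base", base,
         "--method", method, "--output", merged_dir],
        ["mergese", "evaluate", merged_dir,
         "--task", task, "--test-file", test_file],
    ]


def run_pipeline(commands: List[List[str]]) -> None:
    """Run each command in order; check=True stops at the first non-zero exit code."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_pipeline(pipeline_commands(["./model_a", "./model_b"], "microsoft/codebert-base", "./merged", "clone_detection", "./test.csv")) would reproduce the two quickstart steps. Passing argv lists rather than a shell string avoids quoting problems with paths that contain spaces.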

Supported SE tasks

Task	Input	Metric	Known benchmarks
Clone detection	pair	binary F1	BigCloneBench, CLCDSA, GPTCloneBench, POJ-104
Vulnerability detection	single	binary F1	Devign, ReVeal, Big-Vul, D2A, Draper
Defect / bug prediction	single	binary F1	Defects4J, PROMISE, CodeXGLUE-Defect
Code-smell detection	single	binary F1	MLCQ, Qualitas
Commit classification	single	macro F1	CommitBench
Code-review acceptability	pair	binary F1	CodeReview
Comment-code consistency	pair	binary F1	comment-consistency datasets
Exception-type prediction	single	macro F1	CodeXGLUE-Exception
Type inference (closed)	single	macro F1	Typilus, Type4Py, ManyTypes4Py
Custom (any CSV)	auto	auto	—

mergese tasks (CLI) or GET /api/tasks (web) returns the same list.
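The Metric column in the table above distinguishes binary F1 from macro F1. As a reminder of the difference, here is a small sketch using the standard definitions. It is not taken from MergeSE's code; the function names and the choice of 1 as the positive label are assumptions for illustration.

```python
from typing import Hashable, Sequence


def f1_for_class(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
                 positive: Hashable) -> float:
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if the class never occurs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binary_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable],
              positive: Hashable = 1) -> float:
    """Binary F1: score only the positive class (e.g. 'is a clone')."""
    return f1_for_class(y_true, y_pred, positive)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Macro F1: unweighted mean of per-class F1, so rare classes count equally."""
    labels = set(y_true) | set(y_pred)
    if not labels:
        return 0.0
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)
```

Macro F1 suits the multi-class rows (commit classification, exception-type prediction, type inference) because it does not let a dominant class hide poor performance on the others.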

Cross-task merging

When the input models have differently shaped classifier heads (e.g. a 2-class clone detector and a 10-class commit classifier), MergeSE auto-detects the mismatch and runs an encoder-only merge. The base model's head is preserved so you can attach a fresh task-specific head downstream. Pass --encoder-only to force an encoder-only merge, or --include-heads to override the auto-detection.
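One plausible way to detect such a mismatch is to compare parameter shapes across checkpoints and check whether every disagreement sits in the classifier head. The sketch below illustrates that idea only; it is not MergeSE's detection logic. The "classifier." prefix and the parameter names in the usage note follow the naming of RoBERTa-style sequence-classification models in HuggingFace Transformers and are assumptions here, as are the function names.

```python
from typing import Dict, List, Optional, Tuple

Shapes = Dict[str, Tuple[int, ...]]  # parameter name -> tensor shape


def mismatched_params(models: List[Shapes]) -> List[str]:
    """Names whose shape differs between checkpoints, or that some checkpoint lacks."""
    names = set().union(*models)
    bad = []
    for name in sorted(names):
        shapes: set = {m.get(name) for m in models}  # None marks a missing parameter
        if len(shapes) > 1:
            bad.append(name)
    return bad


def plan_merge(models: List[Shapes], head_prefix: str = "classifier.") -> str:
    """Return 'full' if all shapes agree, 'encoder-only' if only head shapes differ."""
    bad = mismatched_params(models)
    if not bad:
        return "full"
    if all(name.startswith(head_prefix) for name in bad):
        return "encoder-only"
    # A mismatch inside the encoder means the checkpoints do not share an architecture.
    raise ValueError(f"encoder parameters differ: {bad}")
```

For example, a 2-class checkpoint with classifier.out_proj.weight of shape (2, 768) and a 10-class checkpoint with shape (10, 768) would yield "encoder-only", while two checkpoints with identical shapes would yield "full".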

Repository layout

MergeSE/
├── mergese.py              # the entire CLI (single file)
├── mergese_tasks.py        # task registry
├── server/
│   ├── app.py              # Flask backend
│   └── presets.json        # example workflows
├── frontend/
│   ├── index.html
│   ├── styles.css
│   ├── app.js
│   └── favicon.svg
├── deploy/
│   ├── nginx.conf          # reverse-proxy site
│   ├── mergese.service     # systemd unit
│   └── gunicorn.conf.py
├── data/
│   └── benchmarks/         # 200-row bundled samples + index.json
├── tests/
│   ├── test_merge_math.py
│   └── test_tasks_and_heads.py
├── docs/
│   ├── CLI.md              # full CLI reference
│   └── WEB.md              # full web-tool reference
├── pyproject.toml
├── requirements.txt
├── Dockerfile
└── docker-compose.yml

Bundled benchmark samples

Name	Rows	Task	Source
`bundled://bigclonebench`	200 (100/100)	clone detection (Java)	CodeXGLUE / BigCloneBench
`bundled://clcdsa`	200 (100/100)	cross-language clones (Java↔Python)	CLCDSA Source Codes
`bundled://gptclonebench`	200 (100/100)	semantic clones (Java)	GPTCloneBench standalone

These are sampled from the original benchmarks for smoke-testing only. For paper-grade numbers, point --test-file at the full dataset.
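If you want a similarly small, class-balanced smoke-test file from your own dataset, a seeded stratified sample is one option. The sketch below is an illustration only: the README does not say how the bundled samples were drawn, and the function name, the "label" key, and the default seed are assumptions.

```python
import random
from typing import Dict, Hashable, List


def balanced_sample(rows: List[dict], label_key: str = "label",
                    per_class: int = 100, seed: int = 0) -> List[dict]:
    """Draw `per_class` rows from each label, reproducibly, then shuffle the result."""
    rng = random.Random(seed)  # local RNG so the global random state is untouched
    by_label: Dict[Hashable, List[dict]] = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)

    sample: List[dict] = []
    for label, group in by_label.items():
        if len(group) < per_class:
            raise ValueError(f"label {label!r} has only {len(group)} rows")
        sample.extend(rng.sample(group, per_class))
    rng.shuffle(sample)
    return sample
```

With two labels and per_class=100 this yields a 200-row, 100/100 split like the bundled files; fixing the seed keeps the sample stable across runs.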

Citing

If you use MergeSE itself, please cite the tool paper:

@inproceedings{roy2026mergese,
  author    = {Palash R. Roy and Banani Roy and Chanchal K. Roy and Kevin A. Schneider},
  title     = {MergeSE: Post-hoc Model Merging for Software Engineering Tasks Without Retraining},
  booktitle = {Proc. ASE Tool Track},
  year      = {2026}
}

If you use the merging methodology MergeSE packages, please also cite our research-track paper:

@inproceedings{roy2026unified,
  author    = {Palash R. Roy and Banani Roy and Chanchal K. Roy and Kevin A. Schneider},
  title     = {A Unified Model for Cross-Domain Clone Detection via Model Merging},
  booktitle = {Proc. ASE},
  year      = {2026}
}

License

Apache-2.0. See LICENSE.
