Commit 94ccfaa

Add ladder code
1 parent a55ca46 commit 94ccfaa

3 files changed, +192 -0 lines changed


configs/ablations/ladder/README.md

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+# CC:Ladder
+
+For a more static and hill-climb-able version of CodeClash, we introduce CC:Ladder: for each arena, we curate a collection of human-written solutions, determine their relative rankings, and then see how high up the ladder models can climb.
+
+For instance, for RobotRumble, we created a ladder with the following steps:
+1. From the online [leaderboard](https://robotrumble.org/boards/2), we manually collected all open-source, published bots and pushed them as branches to the [CC:RobotRumble](https://github.com/CodeClash-ai/RobotRumble) repository.
+2. We then created the `robotrumble.yaml` file in this folder.
+3. Next, from the repository root, we ran `uv run python scripts/run_ladder.py configs/ablations/ladder/robotrumble.yaml`, which runs a PvP tournament for every pair of branches.
+4. From the resulting logs, we calculated each bot's win rate to rank all bots.
+
+You can follow these steps to create your own "CC:<arena>" ladder.
+The tricky part is typically finding a large collection of human solutions for a particular arena.
+We've found that searching for online leaderboards or awesome-<arena> repositories (e.g. [BattleSnake](https://github.com/BattlesnakeOfficial/awesome-battlesnake)) is a good strategy.
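
The ranking in step 4 could be sketched as follows. This is a minimal illustration and not the repository's analysis code: the commit does not show the log format, so the sketch assumes the per-pairing results have already been extracted into a hypothetical mapping from `(bot_a, bot_b)` to `(wins_a, wins_b)`. The function name `rank_by_win_rate` and the example branch names are made up for illustration.

```python
from collections import defaultdict

# Hypothetical pairwise results: (bot_a, bot_b) -> (wins for a, wins for b).
# Assumption: draws, if any, are not counted in either total.
PairResults = dict[tuple[str, str], tuple[int, int]]


def rank_by_win_rate(results: PairResults) -> list[tuple[str, float]]:
    """Return (bot, win_rate) pairs sorted from strongest to weakest."""
    wins: dict[str, int] = defaultdict(int)
    games: dict[str, int] = defaultdict(int)
    for (bot_a, bot_b), (wins_a, wins_b) in results.items():
        total = wins_a + wins_b
        wins[bot_a] += wins_a
        wins[bot_b] += wins_b
        games[bot_a] += total
        games[bot_b] += total
    # Skip bots with no decided games to avoid dividing by zero.
    rates = {bot: wins[bot] / games[bot] for bot in games if games[bot] > 0}
    # Sort by win rate (descending), then by name for a stable order.
    return sorted(rates.items(), key=lambda item: (-item[1], item[0]))


example = {
    ("human/a/bot1", "human/b/bot2"): (150, 100),
    ("human/a/bot1", "human/c/bot3"): (200, 50),
    ("human/b/bot2", "human/c/bot3"): (125, 125),
}
ladder = rank_by_win_rate(example)
for position, (bot, rate) in enumerate(ladder, start=1):
    print(f"{position}. {bot}: {rate:.3f}")
```

Pooling wins over all pairings weights every opponent equally only when each pairing plays the same number of games, which is what a fixed `sims_per_round` would give. A rating system such as Elo or Bradley-Terry would be an alternative if the pairings were unbalanced.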
Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
+tournament:
+  rounds: 0
+game:
+  name: RobotRumble
+  sims_per_round: 250
+  args:
+    raw: false
+players:
+- agent: dummy
+  branch_init: human/aaa/jippty5
+- agent: dummy
+  branch_init: human/aaoutkine/dark-knight
+- agent: dummy
+  branch_init: human/aaoutkine/school-bot
+- agent: dummy
+  branch_init: human/aaoutkine/silo34
+- agent: dummy
+  branch_init: human/aayyad/testbot
+- agent: dummy
+  branch_init: human/anton/anton3000
+- agent: dummy
+  branch_init: human/anton/anton4000
+- agent: dummy
+  branch_init: human/anton/om-om
+- agent: dummy
+  branch_init: human/anton/wallifier
+- agent: dummy
+  branch_init: human/atl15/centerrr
+- agent: dummy
+  branch_init: human/clay/diag-lattice
+- agent: dummy
+  branch_init: human/devchris/black_magic
+- agent: dummy
+  branch_init: human/devchris/first_test
+- agent: dummy
+  branch_init: human/edward/flail
+- agent: dummy
+  branch_init: human/entropicdrifter/gigachad
+- agent: dummy
+  branch_init: human/entropicdrifter/glommer
+- agent: dummy
+  branch_init: human/entropicdrifter/glommerv2
+- agent: dummy
+  branch_init: human/entropicdrifter/seven-of-nine
+- agent: dummy
+  branch_init: human/entropicdrifter/we-are-borg
+- agent: dummy
+  branch_init: human/essickmango/fruity-test
+- agent: dummy
+  branch_init: human/essickmango/pickle-up
+- agent: dummy
+  branch_init: human/gerenuk/gere-ape
+- agent: dummy
+  branch_init: human/happysquid/test
+- agent: dummy
+  branch_init: human/jammyliu/sixty-nine-line
+- agent: dummy
+  branch_init: human/jay0jayjay/naivestarter
+- agent: dummy
+  branch_init: human/jiricodes/jiricodes-bot
+- agent: dummy
+  branch_init: human/kalkin/artemis
+- agent: dummy
+  branch_init: human/kalkin/artemis2
+- agent: dummy
+  branch_init: human/kalkin/maxad
+- agent: dummy
+  branch_init: human/ketza/arthur
+- agent: dummy
+  branch_init: human/ketza/bob
+- agent: dummy
+  branch_init: human/lanity/sivuy
+- agent: dummy
+  branch_init: human/ldang/nemo
+- agent: dummy
+  branch_init: human/ldang/nessy
+- agent: dummy
+  branch_init: human/luisa/baselinegere
+- agent: dummy
+  branch_init: human/luisa/luisasrobot
+- agent: dummy
+  branch_init: human/mario31313/alpha_13
+- agent: dummy
+  branch_init: human/mee42/follow-bot
+- agent: dummy
+  branch_init: human/mitch84/crw_preempt
+- agent: dummy
+  branch_init: human/mitch84/retreat_walk2
+- agent: dummy
+  branch_init: human/mitch84/walk_retreat
+- agent: dummy
+  branch_init: human/mjburgess/rule99
+- agent: dummy
+  branch_init: human/mkap/test
+- agent: dummy
+  branch_init: human/mountain/neuralbot1-1h
+- agent: dummy
+  branch_init: human/mountain/neuralbot2-6h
+- agent: dummy
+  branch_init: human/mountain/neuralbot4-3h
+- agent: dummy
+  branch_init: human/mousetail/coward-bot
+- agent: dummy
+  branch_init: human/mousetail/genetic-robot
+- agent: dummy
+  branch_init: human/navster8/bash-brothers
+- agent: dummy
+  branch_init: human/navster8/maginot-line
+- agent: dummy
+  branch_init: human/sbasu3/meek-bot
+- agent: dummy
+  branch_init: human/sivecano/clouded-mind
+- agent: dummy
+  branch_init: human/suddenlyseals/control-center
+- agent: dummy
+  branch_init: human/tabaxi3k/black-magic-1
+- agent: dummy
+  branch_init: human/tabaxi3k/charles
+- agent: dummy
+  branch_init: human/thesmilingturtl/naivefaa
+- agent: dummy
+  branch_init: human/underscore/bot1
+- agent: dummy
+  branch_init: human/wolfsleuth/simple
+prompts:
+  game_description: |-
+    RobotRumble ladder
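
Since the `players` section repeats the same two keys for every branch, a config like this could be generated rather than written by hand. The sketch below is only an illustration: `build_ladder_config` is a hypothetical helper that is not part of this commit, the nesting of `args` under `game` is an assumption about the file's layout, and the text is rendered with plain string formatting to avoid extra dependencies.

```python
def build_ladder_config(
    game_name: str,
    branches: list[str],
    sims_per_round: int = 250,
) -> str:
    """Render a ladder config with one dummy player per branch."""
    lines = [
        "tournament:",
        "  rounds: 0",
        "game:",
        f"  name: {game_name}",
        f"  sims_per_round: {sims_per_round}",
        "  args:",  # assumed to be nested under `game`
        "    raw: false",
        "players:",
    ]
    # Deduplicate and sort so the output is stable across runs.
    for branch in sorted(set(branches)):
        lines.append("- agent: dummy")
        lines.append(f"  branch_init: {branch}")
    lines += [
        "prompts:",
        "  game_description: |-",
        f"    {game_name} ladder",
    ]
    return "\n".join(lines) + "\n"


config_text = build_ladder_config(
    "RobotRumble",
    ["human/anton/om-om", "human/aaa/jippty5", "human/anton/om-om"],
)
print(config_text)
```

Sorting the branches keeps the file diff-friendly when new bots are added later.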

scripts/run_ladder.py

Lines changed: 52 additions & 0 deletions
@@ -0,0 +1,52 @@
+import argparse
+from pathlib import Path
+
+import yaml
+
+from codeclash import CONFIG_DIR
+from codeclash.constants import LOCAL_LOG_DIR
+from codeclash.tournaments.pvp import PvpTournament
+from codeclash.utils.yaml_utils import resolve_includes
+
+
+def main(
+    config_path: Path,
+):
+    yaml_content = config_path.read_text()
+    preprocessed_yaml = resolve_includes(yaml_content, base_dir=CONFIG_DIR)
+    config = yaml.safe_load(preprocessed_yaml)
+
+    players = config["players"]
+    num_players = len(players)
+    for i in range(num_players):
+        for j in range(i + 1, num_players):
+            player1 = players[i]
+            player1["name"] = player1["branch_init"]
+            player2 = players[j]
+            player2["name"] = player2["branch_init"]
+            pvp_config = {
+                **config,
+                "players": [player1, player2],
+            }
+            vs = f"PvpTournament.{player1['name']}_vs_{player2['name']}".replace("/", "_")
+            output_dir = LOCAL_LOG_DIR / "ladder" / config["game"]["name"] / vs
+            try:
+                tournament = PvpTournament(pvp_config, output_dir=output_dir)
+            except FileExistsError:
+                continue
+            tournament.run()
+
+
+def main_cli(argv: list[str] | None = None):
+    parser = argparse.ArgumentParser(description="CodeClash Ladder Runner")
+    parser.add_argument(
+        "config_path",
+        type=Path,
+        help="Path to the ladder configuration YAML file.",
+    )
+    args = parser.parse_args(argv)
+    main(**vars(args))
+
+
+if __name__ == "__main__":
+    main_cli()
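
To get a feel for how much work a ladder run involves, the pairing logic of `main` can be reproduced without any CodeClash imports. The sketch below is a standalone illustration: `pairing_names` is a hypothetical helper, and it only mirrors how the script enumerates unordered pairs and derives the per-pairing log directory name.

```python
from itertools import combinations
from math import comb


def pairing_names(branches: list[str]) -> list[str]:
    """Return one log-directory name per unordered pair, in the script's order."""
    names = []
    # combinations() yields index pairs with i < j, matching the nested loops.
    for first, second in combinations(branches, 2):
        names.append(f"PvpTournament.{first}_vs_{second}".replace("/", "_"))
    return names


branches = ["human/aaa/jippty5", "human/anton/om-om", "human/ketza/bob"]
names = pairing_names(branches)
for name in names:
    print(name)

# The RobotRumble config lists 58 players, so a full run covers C(58, 2) pairings.
total_pairings = comb(58, 2)
print(total_pairings)  # 1653
```

Two details follow from this naming scheme. Replacing `/` with `_` means two branch names that differ only in where a `/` or `_` appears would map to the same directory. And because the script skips a pairing when constructing `PvpTournament` raises `FileExistsError`, rerunning it presumably resumes with the pairings that have no output directory yet.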
