@Muhtasham
Contributor

No description provided.

@john-b-yang
Contributor

@Muhtasham this is fantastic - thank you so much.

An immediate feedback point: For CodeClash, the arena engine + code (what you put under bridge/examples and bridge/game_server) is typically placed in a different repository (for instance, RobotRumble).

Can you do the same for Bridge? We want to keep the arena code separate from the repository code, both for better maintainability and to make per-arena customization easier.

Let me know if you have any questions here. I'd also reference an existing .Dockerfile to see what updates are needed. Some random notes:

  • Put any arena/game-specific code into a separate repository
  • Your .Dockerfile should clone the github repository
  • Your codeclash/arenas/bridge/bridge.py looks great. I didn't take a close look, but if needed, make sure its paths are grounded in the new, separate repository.
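
To illustrate the last note, here is a rough sketch of one way an arena module could ground its paths in the cloned repository. This is not taken from the actual bridge.py: `ARENA_ROOT`, its `/workspace` value, and the `arena_path` helper are hypothetical names chosen for illustration, and the real clone location depends on what the .Dockerfile does.

```python
from pathlib import Path

# Hypothetical location where the Dockerfile clones the arena repository.
# The real path depends on the Dockerfile; adjust accordingly.
ARENA_ROOT = Path("/workspace")


def arena_path(*parts: str, root: Path = ARENA_ROOT) -> Path:
    """Resolve a path inside the cloned arena repo, rejecting escapes.

    Every path the arena uses is built relative to a single root, so moving
    the arena code to a separate repository only changes that root.
    """
    root_resolved = root.resolve()
    candidate = root_resolved.joinpath(*parts).resolve()
    # Reject anything that resolves outside the repository root
    # (for example via ".." segments or an absolute path component).
    if candidate != root_resolved and root_resolved not in candidate.parents:
        raise ValueError(f"{candidate} is outside the arena root {root_resolved}")
    return candidate
```

With a helper like this, the arena code never hard-codes locations from the old in-repo layout; it only asks for paths relative to wherever the arena repository was cloned.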

Once you do these fixes, let me know and I'll give the arena a couple runs and see how it goes.

Great initial work! 🙏🏼

Muhtasham and others added 3 commits December 11, 2025 21:23
- Update Dockerfile to clone from https://github.com/CodeClash-ai/Bridge
- Remove game_server/ and examples/ (now in separate repo)
- Update bridge.py path references
- Refactor bridge.py to use run_game.py runner script (like RobotRumble)
- Add example config Bridge__claude-3-5-haiku__r2__s10.yaml
- Games now execute properly with correct scoring
@john-b-yang
Contributor

Ok implementation is fantastic, thank you so much @Muhtasham!

I made one update to the original repository: I moved examples/random_agent.py to be the default bridge_agent.py. For all of the arenas, one thing we ensured in the original paper is that the starter codebase contains an already-functioning implementation of the bot. This is partly for "fairness" (we give the LM a working implementation; if it breaks it, that's the model's fault), and partly empirical, to enable investigation of relatively "interesting" behaviors ("uninteresting" = the model just fumbles around for multiple rounds trying to figure out how to put together an initial bot, although this could also be interesting to investigate).
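
For readers unfamiliar with the idea, a working-but-naive starter bot can be as small as the sketch below. This is not the contents of bridge_agent.py; the `choose_action` function and its list-of-strings interface are assumptions made for illustration, since the arena's actual agent API is not shown in this thread.

```python
import random
from typing import Optional, Sequence


def choose_action(
    legal_actions: Sequence[str],
    rng: Optional[random.Random] = None,
) -> str:
    """Return one legal action chosen uniformly at random.

    Hypothetical interface: the caller passes the actions that are legal in
    the current state and gets one of them back.
    """
    if not legal_actions:
        raise ValueError("no legal actions available")
    # Use a caller-supplied generator when given, so games can be reproduced.
    rng = rng or random.Random()
    return rng.choice(legal_actions)
```

A bot like this always makes a legal move, so any later breakage is attributable to the model's edits rather than to the starter code.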

I think this PR is ready to merge! Just approved - will let you take a last pass, and then whenever you feel comfortable, feel free to merge!
