Feature Request: Add Support for DeepMind’s DiscoRL (Discovering RL Algorithms) #6255

@zgx949

Description

DeepMind recently released DiscoRL, a new framework for discovering and generalizing reinforcement learning algorithms automatically.
It aims to produce general-purpose RL algorithms that can adapt across different tasks and environments, rather than being tailored to a single setup.

References:
🔗 https://deepwiki.com/google-deepmind/disco_rl/5-contributing
📄 Original paper (Nature, 2025)
👉 GitHub


Motivation

ML-Agents currently ships several built-in RL algorithms (PPO, SAC, etc.), all of which are hand-designed.
Integrating or experimenting with DiscoRL could:

  • Enable meta-RL or algorithm discovery within Unity environments.
  • Provide a new research direction for users exploring automated RL.
  • Potentially lead to more general, data-efficient learning agents.

This aligns with ML-Agents’ goal of being a flexible platform for both game AI and reinforcement learning research.


Proposed Implementation Ideas

  • Add a new trainer module under mlagents/trainers/disco_rl/ following the pattern of existing algorithms (PPO, SAC).
  • Allow users to select DiscoRL per behavior in the trainer configuration YAML (e.g. trainer_type: disco_rl).
  • Provide a simple Unity demo environment (like GridWorld or Walker) to test its behavior.
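
To make the trainer_type idea above concrete, here is a minimal, self-contained sketch of how a disco_rl entry might be resolved from a parsed behavior config. It does not use the real ML-Agents API: DiscoRLSettings, TRAINER_SETTINGS, load_behavior_settings, and every hyperparameter name are hypothetical and only illustrate the shape such a module could take.

```python
from dataclasses import dataclass
from typing import Any, Dict, Type


@dataclass
class DiscoRLSettings:
    """Hypothetical hyperparameters; these names are assumptions, not DiscoRL's or ML-Agents'."""
    learning_rate: float = 3e-4
    batch_size: int = 64
    update_rule_path: str = ""  # assumed location of a pre-discovered update rule


# Hypothetical registry mapping a `trainer_type` string to its settings class.
TRAINER_SETTINGS: Dict[str, Type[Any]] = {"disco_rl": DiscoRLSettings}


def load_behavior_settings(behavior_cfg: Dict[str, Any]) -> Any:
    """Resolve the settings object for one behavior from its parsed YAML dict."""
    trainer_type = behavior_cfg.get("trainer_type")
    if trainer_type not in TRAINER_SETTINGS:
        raise ValueError(f"Unknown trainer_type: {trainer_type!r}")
    settings_cls = TRAINER_SETTINGS[trainer_type]
    # Unknown hyperparameter keys raise TypeError, which surfaces config typos early.
    return settings_cls(**behavior_cfg.get("hyperparameters", {}))


# Equivalent to a YAML file with `behaviors: GridWorld: trainer_type: disco_rl ...`.
config = {
    "behaviors": {
        "GridWorld": {
            "trainer_type": "disco_rl",
            "hyperparameters": {"learning_rate": 1e-4, "batch_size": 32},
        }
    }
}

settings = load_behavior_settings(config["behaviors"]["GridWorld"])
print(settings)
```

A real integration would more likely hook into whatever trainer registration mechanism ML-Agents exposes rather than a standalone dictionary; the sketch only shows that a string key can select both the trainer and its validated settings.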

Possible Challenges

  • DiscoRL is still experimental and may require adapting its meta-learning infrastructure.
  • ML-Agents trainers are built on PyTorch, so integration would depend on framework and version compatibility with the DiscoRL codebase, as well as on reproducibility.
  • May require additional compute or environment abstractions for “algorithm search.”

Additional Context

If accepted, I’d be happy to help draft an initial implementation plan or contribute a prototype to explore feasibility.
