This project develops a standalone SubRep implementation that turns skill discovery into a certificate-driven, auditable process. SubRep certifies skills via two mathematical tests (CDS/PDS) that guarantee composition safety across motive shifts, preventing negative transfer before skills enter the library. The core mechanism is validated in MO-LunarLander, with certified skills stored as native MeTTa Atoms for future Hyperon integration.
Aligned with Approved Quarter Plan:
| Objective | Goal | Key Results |
|---|---|---|
| 1. Neural Skill Generator | Generate skill summaries from experience | • 2-head MLP (Payoff + Motives) • MDN Interface Defined • TD Error Computation |
| 2. Core Certification | Implement CDS/PDS admission tests | • CDS Test (Universal Benefit) • PDS-ε Test (Acceptable Trade-off) • MO-LunarLander Integration |
| 3. MeTTa Storage | Store certificates as native Atoms | • Certificate Schema Defined • PyMeTTa Bridge (`hyperon`) • Zero-Shot Reuse Demo |
| 4. Minimal Validation | Demonstrate core mechanism works | • Certified Skills Pass Tests • Uncertified Skills Rejected • Admission Rates Documented |
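Objective 1 lists TD error computation as a key result. Since rewards here are vector-valued, a natural reading is a component-wise TD error over the two reward dimensions; the sketch below is an illustrative assumption, not the repo's `utils/` implementation (function name and signature are hypothetical):

```python
import numpy as np

def vector_td_error(reward, value_s, value_next, gamma=0.99, done=False):
    """Component-wise TD error for a vector reward, e.g. [safety, fuel].

    Hypothetical sketch: assumes the value function returns one estimate
    per reward dimension, so the TD error is a 2-vector as well.
    """
    reward = np.asarray(reward, dtype=float)
    # Bootstrap from the next state's value unless the episode ended
    target = reward + (0.0 if done else gamma) * np.asarray(value_next, dtype=float)
    return target - np.asarray(value_s, dtype=float)

# Example: one transition in the 2-objective setting
delta = vector_td_error(reward=[1.0, -0.2],
                        value_s=[0.5, -0.1],
                        value_next=[0.6, -0.15])
print(delta)  # [1.094, -0.2485]
```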
- Python 3.8+
- Git
```bash
# Clone the repository
git clone https://github.com/iCog-Labs-Dev/subrep.git
cd subrep

# Install dependencies
pip install -r requirements.txt

# Configure environment
cp .env.example .env

# Verify environment setup
python tests/test_env.py

# Verify generator output
python tests/test_generator.py

# Run full pipeline (Phase 5+)
python main.py
```

| Folder | Description |
|---|---|
| `env/` | MO-LunarLander wrapper & vector reward handling |
| `generator/` | 2-head MLP skill generator (PyTorch) |
| `certification/` | CDS/PDS admission gate logic |
| `metta/` | PyMeTTa bridge & certificate schema |
| `utils/` | TD error computation, logging, helpers |
| `tests/` | Validation scripts for each component |
- Platform: `mo-gymnasium` (MO-LunarLander-v3)
- Observation Space: `(8,)` – state vector (position, velocity, fuel, etc.)
- Reward Space: `(2,)` – `[Safety_Reward, Fuel_Reward]`
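The native mo-gymnasium MO-LunarLander reward is higher-dimensional than the `(2,)` space above, so the `env/` wrapper presumably folds the native vector into `[Safety_Reward, Fuel_Reward]`. A hypothetical folding function, assuming a native ordering of `[landing, shaping, main_engine_fuel, side_engine_fuel]` (this ordering is an assumption and should be checked against the environment's docs):

```python
import numpy as np

def fold_reward(native_reward):
    """Hypothetical: fold a 4-dim native reward into [Safety_Reward, Fuel_Reward].

    Assumes native ordering [landing, shaping, main_engine_fuel,
    side_engine_fuel]; the actual wrapper lives in env/.
    """
    landing, shaping, main_fuel, side_fuel = np.asarray(native_reward, dtype=float)
    safety = landing + shaping    # crash avoidance + flight stability terms
    fuel = main_fuel + side_fuel  # both engines' fuel costs (non-positive)
    return np.array([safety, fuel])

print(fold_reward([100.0, 1.5, -0.3, -0.03]))  # approximately [101.5, -0.33]
```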
- Architecture: 2-head MLP (Payoff + Motives)
- Input: state vector `(8,)`
- Output: `payoff` scalar `(1,)`; `motives` vector `(2,)`
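A minimal PyTorch sketch of the two-head layout above: a shared trunk feeding a scalar payoff head and a 2-dim motives head. Hidden size, depth, and the class name are illustrative assumptions, not the settings in `generator/`:

```python
import torch
import torch.nn as nn

class SkillGenerator(nn.Module):
    """Sketch of the 2-head MLP: shared trunk, payoff head (1,), motives head (2,)."""

    def __init__(self, obs_dim=8, hidden=64, n_motives=2):
        super().__init__()
        # Shared feature trunk over the (8,) state vector
        self.trunk = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.payoff_head = nn.Linear(hidden, 1)           # scalar payoff estimate
        self.motives_head = nn.Linear(hidden, n_motives)  # per-motive values

    def forward(self, obs):
        h = self.trunk(obs)
        return self.payoff_head(h), self.motives_head(h)

net = SkillGenerator()
payoff, motives = net(torch.randn(4, 8))  # batch of 4 states
print(payoff.shape, motives.shape)  # torch.Size([4, 1]) torch.Size([4, 2])
```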
- CDS: Cone-Dominant Subtask (Universal Benefit)
- PDS-ε: Pareto-Dominant Subtask (Acceptable Trade-off)
- Cones: Full-simplex (Phase 3) -> MDN-learned (Phase 4+)
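One illustrative reading of the two tests, for the full-simplex cone of Phase 3 (the exact definitions live in `certification/`; `delta` below is the skill's per-motive advantage vector, and the functions are hypothetical): a skill is cone-dominant iff its weighted benefit `w · delta` is non-negative for every weight `w` in the simplex, which by linearity reduces to checking the simplex vertices, i.e. `delta >= 0` component-wise. PDS-ε relaxes this to an acceptable trade-off.

```python
import numpy as np

def cds(delta):
    """Cone-Dominant Subtask: benefit under every simplex weight.
    Checking the simplex vertices suffices, so require delta >= 0
    in every motive dimension."""
    return bool(np.all(np.asarray(delta) >= 0))

def pds_eps(delta, eps=0.05):
    """Pareto-Dominant up to eps: some motive strictly improves and no
    motive loses more than eps (hypothetical reading of the trade-off)."""
    delta = np.asarray(delta)
    return bool(np.any(delta > 0) and np.all(delta >= -eps))

print(cds([0.3, 0.1]))        # True:  universal benefit, admitted
print(cds([0.3, -0.01]))      # False: fails strict cone dominance
print(pds_eps([0.3, -0.01]))  # True:  small fuel loss within the eps budget
```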
- Package: `hyperon` (Python bindings)
- Operations: `add_atom`, `match`, `space`
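For intuition, one hypothetical shape for a stored certificate Atom is sketched below; the actual schema is defined in `metta/`, and the field names here are illustrative only. The `hyperon` bridge would add such an atom to a space via `add_atom` and retrieve it with `match`.

```metta
; Hypothetical certificate schema -- all field names are illustrative
(certificate
    (skill hover-stabilize)
    (test CDS)
    (cone full-simplex)
    (advantage (0.31 0.07))
    (status admitted))
```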
- MDN Training: Full Motive Decomposition Network implementation.
- MetaMo Integration: Dynamic weight management & risk budgets.
- Cross-Paradigm Skills: Logic macros & evolutionary programs.
- Benchmarking: Hypervolume efficiency vs. standard MORL baselines.