Commit 1ce9a0f

[tinker-cookbook] Initial drop of xmux (#138)
1 parent 46b3bbd commit 1ce9a0f

10 files changed: +2016 -0 lines changed

pyproject.toml

Lines changed: 1 addition & 0 deletions
@@ -9,6 +9,7 @@ authors = [
 requires-python = ">=3.11"
 dependencies = [
     "chz",
+    "cloudpickle",
     "datasets",
     "numpy",
     "rich",

tinker_cookbook/xmux/README.md

Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
# xmux - TMUX-based Experiment Launcher

xmux is a tool for launching and managing hierarchical ML experiments using TMUX. It provides an interactive control window for monitoring and managing large numbers of concurrent experiments.

## Key Features

- **Hierarchical Organization**: Session = Sweep, with a control window for management
- **Smart Grouping**: Group related experiments in the same window as panes
- **Interactive Control**: Navigate, monitor, and kill experiments from the control window
- **Smart Naming**: Automatic abbreviation of long experiment names
- **Multi-line Status Bar**: Clear overview of all running experiments

## Quick Start

```python
from tinker_cookbook.xmux import JobSpec, SwarmConfig, launch_swarm

# Define your experiments
job_specs = [
    JobSpec(
        main_fn=train_model,  # Your training function
        log_relpath="sweep/model1/lr0.001",
        entrypoint_config={"model": "bert", "lr": 0.001}
    ),
    # ... more experiments
]

# Launch the swarm
config = SwarmConfig(sweep_name="my-lr-sweep")
launch_swarm(job_specs, config)
```

## Grouping Experiments

Jobs that share a `tmux_window_name` are grouped into the same window as panes:

```python
# Group by model type
JobSpec(
    main_fn=train_model,
    log_relpath="sweep/bert/lr0.001",
    entrypoint_config=config,
    tmux_window_name="bert",  # Groups all BERT experiments
    pane_title="lr0.001"  # Shows in the pane
)
```
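
For a full grid, the grouping fields can be generated rather than written by hand. The sketch below is only an illustration: `grouped_job_kwargs` is a hypothetical helper that is not part of xmux, and the `gpt2` model name is made up. It builds plain keyword-argument dicts using the `JobSpec` field names shown above, so it runs without xmux installed.

```python
from itertools import product


def grouped_job_kwargs(models, learning_rates, sweep_dir="sweep"):
    """Build JobSpec keyword arguments, one tmux window per model.

    Hypothetical helper (not part of xmux); it only reuses the field
    names that appear in the README example.
    """
    kwargs_list = []
    for model, lr in product(models, learning_rates):
        kwargs_list.append(
            {
                "log_relpath": f"{sweep_dir}/{model}/lr{lr}",
                "entrypoint_config": {"model": model, "lr": lr},
                "tmux_window_name": model,  # same name -> same window
                "pane_title": f"lr{lr}",  # distinguishes panes in a window
            }
        )
    return kwargs_list


jobs = grouped_job_kwargs(["bert", "gpt2"], [0.001, 0.0001])
# With xmux installed, these could be turned into job specs, e.g.:
#   job_specs = [JobSpec(main_fn=train_model, **kw) for kw in jobs]
```

Keeping the window name equal to the model name means a sweep over two models produces two windows, each with one pane per learning rate.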

## Using the Control Window

After launching, attach to the TMUX session:

```bash
tmux attach-session -t my-lr-sweep
```

Control window commands:
- **0-9**: Jump to window by number
- **↑↓**: Navigate job list
- **k**: Kill selected job
- **K**: Kill entire window group
- **r**: Refresh status
- **q**: Quit control window

## Adding to an Existing Experiment

If a session for the sweep already exists, you can add jobs to it by using the same sweep name.
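
Before adding jobs, it can be useful to check whether a session for the sweep is already running. The following is a minimal sketch, assuming the tmux session is named after the sweep (as the `tmux attach-session -t my-lr-sweep` example suggests). `tmux_session_exists` is a hypothetical helper, not an xmux function; it relies only on the standard `tmux has-session` command.

```python
import subprocess


def tmux_session_exists(session_name, run=subprocess.run):
    """Return True if `tmux has-session` reports the session exists.

    Hypothetical helper (not part of xmux). `run` is injectable so the
    function can be exercised without a tmux server.
    """
    try:
        result = run(
            ["tmux", "has-session", "-t", session_name],
            capture_output=True,
        )
    except FileNotFoundError:
        # tmux is not installed or not on PATH
        return False
    # tmux exits with status 0 when the target session is found
    return result.returncode == 0
```

For example, `tmux_session_exists("my-lr-sweep")` could guard a script that is meant to extend a running sweep. Note that tmux may resolve `-t` targets by prefix, so a shorter name can match a longer session name.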

## Examples

See `examples/ml_sweep.py` for complete examples:

```bash
# Run demo with dry-run to see what would happen
python examples/ml_sweep.py 1 --dry-run

# Run actual experiments
python examples/ml_sweep.py 2

# Demo options:
#   1 - Individual windows (no grouping)
#   2 - Grouped by model
#   3 - Mixed grouping strategy
#   4 - Large scale sweep (72 experiments)
```

## Tips

1. **Kill entire sweep**: `tmux kill-session -t sweep-name`
2. **List xmux sessions**: Look for sessions with metadata in `~/experiments/.xmux/`
3. **Window limit**: Use grouping for large sweeps to avoid too many windows
4. **Pane limit**: Set `max_panes_per_window` to control pane overflow
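
The README does not say how xmux distributes panes once a group exceeds `max_panes_per_window`. One plausible reading, sketched here purely as an assumption, is that the group is split into consecutive windows of at most that many panes. `split_into_windows` is a hypothetical function, not xmux code.

```python
def split_into_windows(pane_titles, max_panes_per_window):
    """Chunk pane titles into windows of at most `max_panes_per_window`.

    Illustrative assumption about overflow behavior; xmux's actual
    strategy is not documented in the README.
    """
    if max_panes_per_window < 1:
        raise ValueError("max_panes_per_window must be at least 1")
    return [
        pane_titles[start:start + max_panes_per_window]
        for start in range(0, len(pane_titles), max_panes_per_window)
    ]


# Seven panes with a limit of three per window
windows = split_into_windows([f"run{i}" for i in range(7)], 3)
```

Under this assumption, a smaller limit trades fewer panes per window for more windows, which is the tension the window-limit and pane-limit tips describe.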

tinker_cookbook/xmux/__init__.py

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
"""xmux - TMUX-based experiment launcher for ML sweeps"""

from .core import JobSpec, SwarmConfig, launch_swarm

__version__ = "0.1.0"
__all__ = ["JobSpec", "SwarmConfig", "launch_swarm"]
