
[Feature] Improve embedder model migration experience #1523

@A0nameless0man

Description


Problem

Switching the embedding model in OpenViking requires manual, error-prone steps with no validation and no batch tooling. The process overwrites vectors in-place, causing search quality degradation visible to end-users during the migration window. A blue-green migration strategy would eliminate user-visible impact -- search always hits a complete, consistent vector set.

Current Workflow

# 1. Admin manually edits ov.conf -- no validation at all
vim ~/.openviking/ov.conf
#   modify: embedding.dense.model, dimension, api_base...

# 2. Restart the server -- downtime, no pre-check
docker restart openviking
# or:
openviking-server  # starts without any verification

# 3. Reindex resources one by one -- no batch operation
ov reindex viking://resources/doc-a --regenerate --wait
ov reindex viking://resources/doc-b --regenerate --wait
# ... repeat for every resource

# 4. User-visible impact:
#   - Search quality degrades silently (old vectors + new model = mismatch)
#   - No notification that embedding has changed
#   - No way to confirm reindex completion

Pain Points

Admin

| Problem | Impact | Evidence |
|---|---|---|
| No endpoint-level dimension validation | Config layer has a VectorDB vs Embedding dimension consistency WARNING (openviking_cli/utils/config/open_viking_config.py) and auto-syncs, but it never verifies whether the configured dimension matches the embedding endpoint's actual output | openviking-server doctor only checks if the API key is set; it does not test endpoint connectivity |
| No bulk reindex | Must run ov reindex <uri> individually for each resource | Current reindex endpoint accepts only a single URI |
| No model compatibility check | Config-time model/provider compatibility is not validated; errors surface only at query time. Example: provider=openai + a downstream model that doesn't support matryoshka representations (e.g. certain OpenAI-compatible Qwen endpoints) -- v0.3.5 passed through the dimensions parameter, causing a 400 error (#1442, fixed in v0.3.6). Or provider=litellm + a bare model name (Qwen3-Embedding-0.6B instead of dashscope/Qwen3-Embedding-0.6B) -- the query fails with "LLM Provider NOT provided" | #1442 -- users only discover config errors at search time |
| No progress visibility | ov reindex --wait blocks with no progress indication | No progress events |
| No rollback path | After reindexing with a new model, old vectors are gone | build_index() overwrites in-place |

User

| Problem | Impact |
|---|---|
| Search quality degradation during migration | reindex overwrites vectors in-place. During the migration window, query vectors use the new model while some data still has old model vectors -- results are unpredictable |
| No atomicity | No all-or-nothing switchover. Users may hit a half-old-half-new vector set |

Users should always hit a complete, consistent vector set during migration. Blue-green migration (build new vector set in background, then atomic active-pointer switchover) is the straightforward solution.

Proposed Solution

Dependency on #1439: This proposal builds on #1439 (feat: detect embedding model drift and add rebuild tool), which provides:

  • embedding_compat.py -- embedding identity persistence (embedding_meta.json), compatibility_identity(), and startup-time ensure_embedding_collection_compatibility() check
  • vector_rebuild.py -- VectorRebuildService (discover_accounts(), rebuild_account(), rebuild_accounts())
  • openviking-rebuild-vectors CLI -- --all-accounts batch rebuild entry point
  • EmbeddingCompatibilityError -- startup fail-fast mechanism

This proposal adds blue-green migration, dual-write, and atomic switchover on top of #1439. If #1439 is not yet merged, Section 4 (health gate) and Section 6 (migration resilience) will need the embedding_meta.json persistence part implemented first.

1. Extend ov reindex with batch support

ov reindex <URI> currently accepts only a single URI. When switching embedding models, admins need to reindex every resource -- doing this one at a time is impractical.

# Current:
ov reindex viking://resources/doc-a --regenerate --wait

# Proposed:

# Reindex all resources
ov reindex --all --regenerate --wait=false

# Reindex with glob pattern
ov reindex viking://resources/my-project/** --regenerate

# Dry-run: show what would be reindexed (count + estimated time)
ov reindex --all --dry-run
# Output:
#   Resources to reindex: 47
#   Estimated time: ~12 min
#   Current embedding model: text-embedding-v4 (1024d)
#   WARNING: Dimension mismatch detected -- vectordb rebuild may be required
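The glob form could be expanded client-side against the known resource URIs. A minimal sketch, assuming the server can list all URIs; the helper name is hypothetical, and `**` is simplified to fnmatch's `*` (which already matches across path segments):

```python
from fnmatch import fnmatch

def expand_reindex_targets(pattern: str, all_uris: list[str]) -> list[str]:
    """Expand a viking:// glob pattern against the known resource URIs.

    fnmatch has no special `**` handling, so normalize it to `*`; this is
    a simplification of full glob semantics.
    """
    normalized = pattern.replace("**", "*")
    return [uri for uri in all_uris if fnmatch(uri, normalized)]

uris = [
    "viking://resources/my-project/doc-a",
    "viking://resources/my-project/sub/doc-b",
    "viking://resources/other/doc-c",
]
matched = expand_reindex_targets("viking://resources/my-project/**", uris)
```

With this expansion, `--all` is just the degenerate pattern that matches every URI.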

2. Extend ov config validate with live endpoint check

openviking-server doctor (openviking_cli/doctor.py) already checks config syntax, Python version, native engine, AGFS, embedding API key existence, VLM config, and disk space. It does not test endpoint connectivity or verify actual embedding dimensions. This proposal adds those checks either to doctor or to a new ov config validate --live command. The two don't conflict -- doctor covers operational health, --live covers pre-change validation.

ov config validate currently only checks config syntax (JSON schema via serde). Extend it to verify the endpoint is reachable and the output dimension matches config.

# Current:
ov config validate
# -> only checks JSON schema

# Proposed:
ov config validate --live
# Checks:
#   PASS: Config syntax valid
#   PASS: Embedding endpoint reachable
#   PASS: Embedding dimension matches config (1024d)
#   WARNING: Embedding model differs from stored model (text-embedding-v3 -> text-embedding-v4)
#   Note: Run `ov reindex --all` to rebuild vectors

2.1 Config structure for blue-green

During migration, the system needs both the current (active) model config and the target model config. We propose a named embedding.migration map in ov.conf, where each entry is a named migration target. The existing embedding config is implicitly named default.

embedding.migration is a map of named configs, not a single block. This design:

  1. Supports read-only ov.conf -- migration targets are pre-defined, no runtime writes needed (works with container read-only mounts)
  2. Allows multiple migration targets -- admins can pre-configure several models and pick one via CLI
  3. Makes default implicit -- the existing top-level embedding config (dense/sparse/hybrid) is the active profile

Background: EmbeddingConfig supports three embedding types:

  • dense -- dense vectors (most common)
  • sparse -- sparse vectors (BM25-style)
  • hybrid -- single model returning both dense + sparse

get_embedder() logic: if hybrid exists, use hybrid embedder; if both dense and sparse exist, use CompositeHybridEmbedder; if only dense, use dense embedder.
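That precedence (hybrid over dense+sparse composite over dense-only) can be sketched as a small dispatcher. This is illustrative only: the real get_embedder() returns embedder objects, while this sketch returns labels so the selection order is easy to see:

```python
def select_embedder_kind(config: dict) -> str:
    """Pick the embedder per the precedence described above:
    hybrid > dense+sparse composite > dense only."""
    if "hybrid" in config:
        return "hybrid"
    if "dense" in config and "sparse" in config:
        return "composite"  # CompositeHybridEmbedder in the real code
    if "dense" in config:
        return "dense"
    raise ValueError("no embedding model configured")
```

A migration target must resolve through the same precedence, which is why each migration entry mirrors the dense/sparse/hybrid structure.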

Migration configs need to cover all three cases. Each migration entry mirrors the model config fields in embedding:

Case A: dense only (most common)

{
  "embedding": {
    "dense": {
      "provider": "volcengine",
      "model": "doubao-embedding-vision-251215",
      "dimension": 1024,
      "api_base": "https://ark.cn-beijing.volces.com/api/v3",
      "api_key": "..."
    },
    "migration": {
      "openai-v3-large": {
        "dense": {
          "provider": "openai",
          "model": "text-embedding-3-large",
          "dimension": 3072,
          "api_base": "https://api.openai.com/v1",
          "api_key": "..."
        }
      }
    },
    "max_concurrent": 10
  }
}

Case B: dense + sparse (composite hybrid)

{
  "embedding": {
    "dense": { "provider": "volcengine", "model": "...", "dimension": 1024 },
    "sparse": { "provider": "volcengine", "model": "..." },
    "migration": {
      "openai-mixed": {
        "dense": { "provider": "openai", "model": "...", "dimension": 3072 },
        "sparse": { "provider": "openai", "model": "..." }
      }
    },
    "max_concurrent": 10
  }
}

Case C: hybrid (single-model hybrid)

{
  "embedding": {
    "hybrid": { "provider": "volcengine", "model": "...", "dimension": 1024 },
    "migration": {
      "openai-hybrid": {
        "hybrid": { "provider": "openai", "model": "...", "dimension": 3072 }
      }
    },
    "max_concurrent": 10
  }
}

Note: Each migration entry mirrors only dense/sparse/hybrid model config fields. Top-level runtime settings (max_concurrent, circuit_breaker, max_retries, etc.) are global and don't change during migration.

CLI references migration targets by name:

# List available migration targets
ov reindex --list-targets
# Output:
#   Available migration targets:
#   - openai-v3-large  (openai/text-embedding-3-large, 3072d)
#   - qwen-3-large     (dashscope/qwen3-embedding, 1024d)

# Start migration with a pre-configured target
ov reindex --all --target openai-v3-large
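Resolving `--target` against the config map might look like the following sketch; the function name is an assumption, and it accounts for runtime keys (e.g. rollback_ttl_hours from Section 6.5) coexisting in the migration map:

```python
def resolve_migration_target(embedding_cfg: dict, name: str) -> dict:
    """Look up a named migration target; fail with a clear error listing
    what is available."""
    targets = embedding_cfg.get("migration", {})
    # only dict-valued entries are named targets; scalar keys like
    # rollback_ttl_hours are runtime settings, not targets
    named = {k: v for k, v in targets.items() if isinstance(v, dict)}
    if name not in named:
        raise KeyError(
            f"unknown migration target {name!r}; available: {sorted(named)}"
        )
    return named[name]

cfg = {
    "dense": {"provider": "volcengine", "model": "m", "dimension": 1024},
    "migration": {
        "rollback_ttl_hours": 72,
        "openai-v3-large": {"dense": {"provider": "openai", "dimension": 3072}},
    },
}
target = resolve_migration_target(cfg, "openai-v3-large")
```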

Lifecycle:

| Phase | Active profile | Migration state | Behavior |
|---|---|---|---|
| Normal | default | None | Single-model operation |
| Migration start | default | Target name selected via CLI | Dual-write + bulk re-embed to target |
| Migration complete | Auto-switched to target | Target entry removed from config | Single-model, new config becomes default |
| Rollback | Reverted to default | Target re-added | Dual-write back to old |

3. Blue-green vector migration

Instead of overwriting vectors in-place during reindex, maintain two vector sets ("blue" = current active, "green" = new model being built). Users always query the active set. Once the green set is fully built and verified, atomically promote it to active.

Changing model and changing dimension are the same operation

With in-place overwrite, changing the embedding dimension (e.g. 1024d to 3072d) requires dropping and recreating the entire vectordb -- destructive and irreversible. With blue-green, both "new model" and "new dimension" are handled the same way: write to the inactive collection, then flip the pointer. No schema migration, no data loss, no downtime.

Migration timeline

flowchart TD
    A["Admin starts reindex with new model"] --> B["Phase 1: Enable dual-write<br/>New writes -> default + openai-v3-large<br/>Queries -> default"]
    B --> C["Phase 2: Bulk re-embed existing resources<br/>default (active): text-embedding-v3, 47 resources<br/>openai-v3-large (building): 12/47...<br/>Queries -> default<br/>Dual-write active"]
    C -->|"openai-v3-large complete"| D["Phase 3: Query switchover<br/>Active pointer: default -> openai-v3-large<br/>Queries -> openai-v3-large<br/>Dual-write still active"]
    D --> E["Phase 4: Disable dual-write<br/>Writes -> openai-v3-large only<br/>default retained for rollback until TTL expires"]
    E -->|"TTL expires or admin confirms"| F["Phase 5: Delete old set<br/>Delete default collection<br/>openai-v3-large becomes the new default"]

API surface

class VectorDB:
    def get_active_collection(self) -> str:
        """Returns collection name for reads -- e.g. 'default' or 'openai-v3-large'."""
        return self.metadata.get("active_embedding_set", "default")

    def is_dual_write_enabled(self) -> bool:
        return self.metadata.get("dual_write", False)

    def switch_active(self, target: str) -> None:
        """Atomic metadata write -- instant switchover for all reads."""
        self.metadata.set("active_embedding_set", target)

    def set_dual_write(self, enabled: bool) -> None:
        self.metadata.set("dual_write", enabled)

    def upsert(self, resource_uri: str, vector) -> None:
        """In dual-write mode, writes to both active and inactive collections."""
        active = self.get_active_collection()
        self._write_to_collection(active, resource_uri, vector)
        if self.is_dual_write_enabled():
            inactive = self._get_inactive_collection()
            self._write_to_collection(inactive, resource_uri, vector)

The migration controller orchestrates phases through these primitives:

  1. set_dual_write(True) -- enable dual-write
  2. Loop: embed_with_new_model() then _write_to_collection(green, ...) -- bulk re-embed to green only
  3. switch_active(green) -- atomic query switchover
  4. set_dual_write(False) -- disable dual-write
  5. delete_collection(blue) -- cleanup old set
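The five steps above compose into a small controller loop. A minimal sketch against an in-memory stand-in for the VectorDB metadata API (FakeVectorDB and run_migration are this sketch's inventions, not proposed names):

```python
class FakeVectorDB:
    """In-memory stand-in for the VectorDB primitives sketched above."""
    def __init__(self):
        self.meta = {"active_embedding_set": "default", "dual_write": False}
        self.collections = {"default": {}}

    def switch_active(self, target: str) -> None:
        self.meta["active_embedding_set"] = target

    def set_dual_write(self, enabled: bool) -> None:
        self.meta["dual_write"] = enabled

    def delete_collection(self, name: str) -> None:
        self.collections.pop(name, None)

def run_migration(db, green: str, uris, embed) -> None:
    db.set_dual_write(True)                              # phase 1
    for uri in uris:                                     # phase 2: bulk re-embed
        db.collections.setdefault(green, {})[uri] = embed(uri)
    db.switch_active(green)                              # phase 3
    db.set_dual_write(False)                             # phase 4
    db.delete_collection("default")                      # phase 5
```

The key property: queries only ever see the collection named by active_embedding_set, so every phase boundary is a single metadata write.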

Rollback

# If admin detects quality regression after switchover:
ov reindex --rollback
# Instantly switches back to previous set (still on disk)

# Rollback behavior varies by phase:
#   Phase 1-2 (dual-write/building): disable dual-write, discard green set
#   Phase 3-4 (switched/dual-write-off): flip pointer back to blue, re-enable dual-write briefly
#   Phase 5 (cleanup): too late, blue already deleted -- full reindex required
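The phase-dependent rollback behavior is essentially a dispatch on the persisted phase; a sketch with assumed phase names (the real names would come from migration_state.json):

```python
def rollback_action(phase: str) -> str:
    """Map the current migration phase to the rollback behavior
    described above. Phase names are this sketch's assumption."""
    if phase in ("dual_write", "building"):
        return "disable dual-write, discard green set"
    if phase in ("switched", "dual_write_off"):
        return "flip pointer back to blue, re-enable dual-write"
    if phase == "cleanup_done":
        return "too late: blue deleted, full reindex required"
    raise ValueError(f"unknown phase: {phase}")
```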

4. Runtime embedding model health gate

On server startup, validate embedding config against existing vectordb state:

# NOTE: Current vectordb metadata does not store embedding_model name (only dimension).
# This pseudocode assumes #1066/#1439's embedding identity metadata is implemented.
current_model = config.embedding.dense.model
stored_model = vectordb.get_metadata("embedding_model")  # requires #1066/#1439
stored_dim = vectordb.get_metadata("embedding_dimension")

if current_model != stored_model or config.embedding.dense.dimension != stored_dim:
    log.warning(
        f"Embedding model changed: {stored_model} ({stored_dim}d) -> {current_model} ({config.embedding.dense.dimension}d). "
        f"Run 'ov reindex --all' to rebuild vectors with blue-green migration."
    )

5. Config validation on load

When ov.conf is loaded, immediately test the embedding endpoint with a trivial input (not on first search):

# Validate on config load, not on first search
try:
    result = await embedder.embed("health check", is_query=True)
    assert len(result.dense_vector) == config.embedding.dense.dimension
except Exception as e:
    raise ConfigError(f"Embedding endpoint validation failed: {e}")

6. Migration resilience

Migration can take hours. If the server restarts or the CLI disconnects, all progress is lost without persistent state.

Building on #1439: #1439 provides embedding identity persistence (embedding_meta.json) and VectorRebuildService (per-account delete + reindex). This section extends that foundation:

Why not TaskTracker: The existing TaskTracker (openviking/service/task_tracker.py) is a pure in-memory registry for short-lived background operations (e.g. session commit). Its design explicitly states "v1 is pure in-memory (no persistence). Tasks are lost on restart." Migration runs for hours and must survive restarts -- TaskTracker is fundamentally unsuitable. Migration state is persisted to disk independently.

Design principle: The five-phase flow (dual-write, then bulk re-embed, then query switchover, then disable dual-write, then cleanup) ensures zero data loss. Dual-write is enabled before bulk re-embed starts, so all writes during the migration window enter both collections.

6.1 Migration state persistence

Migration state is persisted separately from #1439's embedding_meta.json -- independent but complementary:

| File | Source | Content | Purpose |
|---|---|---|---|
| embedding_meta.json (#1439) | persist_embedding_metadata() | Active embedding identity (provider, model, dimension, mode) | Model drift detection at startup (ensure_embedding_collection_compatibility()) |
| migration_state.json (this proposal) | Migration controller | Blue-green switchover progress (phase, progress, blue/green names) | Incomplete migration detection at startup, supports --resume |
# Stored in <workspace>/.meta/migration_state.json
# e.g. ./data/.meta/migration_state.json (workspace defaults to ./data)
{
  "migration_id": "mig_20260417_001",
  "blue_name": "default",
  "green_name": "openai-v3-large",
  "active_name": "default",
  "phase": "building",
  "progress": {
    "total": 47,
    "processed": 12,
    "failed": 0,
    "failed_uris": []
  },
  "started_at": "2026-04-17T10:00:00Z",
  "updated_at": "2026-04-17T10:15:00Z"
}
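Since a crash can happen mid-write, the state file itself should be written atomically. A sketch of one standard approach (temp file in the same directory, fsync, then os.replace, which is an atomic rename on POSIX); the function name is illustrative:

```python
import json
import os
import tempfile

def save_migration_state(path: str, state: dict) -> None:
    """Persist migration state atomically: a crash mid-write can never
    leave a torn or half-written migration_state.json behind."""
    directory = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(state, f, indent=2)
            f.flush()
            os.fsync(f.fileno())          # force bytes to disk before rename
        os.replace(tmp, path)             # atomic rename on POSIX
    except BaseException:
        os.unlink(tmp)
        raise
```

Writing the temp file in the same directory matters: os.replace is only atomic within a single filesystem.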

On server startup, check for incomplete migration and offer resume:

# Server startup detection
WARNING: Incomplete migration detected (mig_20260417_001):
   Phase: building (12/47 resources processed)
   Blue: default, Green: openai-v3-large
   Dual-write: enabled

   Options:
   1. ov reindex --resume        # Continue bulk re-embed from checkpoint
   2. ov reindex --abort         # Disable dual-write, discard green set
   3. ov reindex --all           # Restart from scratch

# If crashed after query switchover:
WARNING: Migration interrupted after query switchover:
   Active: openai-v3-large, Dual-write: was enabled
   Blue set (default) still on disk (for rollback)

   Options:
   1. ov reindex --resume        # Disable dual-write, continue to cleanup
   2. ov reindex --rollback      # Switch queries back to default
   3. ov reindex --abort         # Abort migration entirely

6.2 Interrupted migration recovery

When ov reindex --all is interrupted (Ctrl+C, network disconnect, OOM):

| Scenario | Behavior |
|---|---|
| Graceful stop (SIGINT) | Finish current resource, save state, exit cleanly |
| Hard crash (OOM, kill -9) | On next startup, detect incomplete state from disk |
| Network disconnect | Embedder retry logic handles transient failures; a persistent failure marks the resource as failed and continues |
| Crash during dual-write | On restart, re-enable dual-write from persisted state, resume bulk re-embed |
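The startup prompts in Section 6.1 follow directly from the persisted phase. A sketch of that mapping, with assumed phase names mirroring the rollback sketch:

```python
def recovery_options(state: dict) -> list[str]:
    """Suggest CLI recovery flags from persisted migration state,
    per the startup prompts described above."""
    phase = state.get("phase")
    if phase == "building":
        return ["--resume", "--abort", "--all"]
    if phase in ("switched", "dual_write_off"):
        return ["--resume", "--rollback", "--abort"]
    return []  # no incomplete migration: nothing to offer
```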

6.3 Partial failure handling

Individual resource embed failures should not abort the entire migration:

# After migration completes with some failures:
PASS: 45/47 resources migrated successfully
FAIL: 2 resources failed:
   - viking://resources/doc-x (embedder timeout)
   - viking://resources/doc-y (invalid content)

Run `ov reindex --retry-failed` to retry failed resources.

6.4 Disk space pre-check

--dry-run verifies sufficient disk space for the green collection:

ov reindex --all --dry-run
# Output:
#   Resources to reindex: 47
#   Estimated green collection size: ~2.3 GB
#   Available disk space: 15.7 GB
#   Estimated time: ~12 min
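One way to estimate the green set is from the blue set's on-disk size scaled by the dimension ratio (3072/1024 = 3.0 for the example target), plus headroom; shutil.disk_usage supplies the free-space side. A sketch with an assumed function name and heuristic:

```python
import shutil

def check_disk_for_green(workspace: str, blue_bytes: int,
                         dim_ratio: float = 1.0,
                         headroom: float = 1.2) -> bool:
    """Rough pre-check: scale the blue collection's size by the new/old
    dimension ratio, add headroom for temporary files, and compare to
    free space on the workspace filesystem."""
    estimated = int(blue_bytes * dim_ratio * headroom)
    return shutil.disk_usage(workspace).free >= estimated
```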

6.4.1 Migration state writability check

--dry-run also verifies that migration_state.json can be written and persisted:

ov reindex --all --dry-run
# Output:
#   Resources to reindex: 47
#   Estimated green collection size: ~2.3 GB
#   Available disk space: 15.7 GB
#   Estimated time: ~12 min
#
#   Migration state:
#   PASS: State path writable: <workspace>/.meta/migration_state.json
#   PASS: Test write successful -- state will survive restarts

# WARNING: container with read-only mount
ov reindex --all --dry-run
# Output:
#   ...
#   Migration state:
#   FAIL: State path NOT writable: <workspace>/.meta/migration_state.json
#      Reason: Read-only filesystem
#
#   WARNING: If workspace is on a read-only mount,
#      migration state will be lost on container restart.
#      A full reindex will be required if migration is interrupted.
#
#   To fix: ensure workspace directory is on a persistent writable volume

6.5 Rollback TTL

Blue collection retention after switchover is configurable:

// ov.conf
{
  "embedding": {
    "migration": {
      "rollback_ttl_hours": 72,    // Delete blue set after 72h if not rolled back
      "auto_confirm": false         // If true, auto-cleanup after TTL without admin confirmation
    }
  }
}

Alternatives Considered

  1. Status quo (manual, in-place overwrite) -- works but tedious, and users see search quality degradation during the migration window. Unacceptable. Additionally, changing model or dimension requires manually dropping and recreating the entire vectordb -- a destructive operation.
  2. Automatic reindex on config change -- too risky. A typo could trigger an expensive reindex. Better to keep it explicit via ov reindex --all.
  3. Block search during migration -- too aggressive. Blue-green achieves zero downtime.
  4. Expose migration state in API responses -- leaks internal operational details to users. Blue-green keeps migration invisible to users, which is the better approach.

Expected Outcome

| Who | Before | After |
|---|---|---|
| Admin | Manual config edit, restart without validation, individual reindex, no progress, no rollback | ov config validate --live, then ov reindex --all --dry-run, then ov reindex --all with progress and --rollback |
| Admin | Changing model/dimension requires dropping and recreating the entire vectordb | Blue-green: changing model and changing dimension are the same operation |
| Admin | Migration progress lost on interruption, no state after container restart | Migration state persisted to <workspace>/.meta/, --resume continues from checkpoint |
| User | Search quality degrades during migration (half-old-half-new vectors) | Zero user-visible impact -- search always hits a complete, consistent vector set |

Related Issues

Feature Area

Configuration / CLI / Server Runtime

Contribution

  • I am willing to contribute to implementing this feature
