Commit 6c624f2

Authored by saturley-hall, athreesh, harryskim, claude, and akshatha-k
docs: address Harry/VDR feedback + fixing broken links across repository (#3802) (#3841)
Signed-off-by: Harry Kim <[email protected]>
Signed-off-by: athreesh <[email protected]>
Signed-off-by: akshatha-k <[email protected]>
Signed-off-by: Harrison Saturley-Hall <[email protected]>
Signed-off-by: Harrison King Saturley-Hall <[email protected]>
Co-authored-by: Anish <[email protected]>
Co-authored-by: Harry Kim <[email protected]>
Co-authored-by: Claude <[email protected]>
Co-authored-by: akshatha-k <[email protected]>
1 parent a22cf24 commit 6c624f2

57 files changed (+689, -397 lines)

README.md

Lines changed: 4 additions & 4 deletions
@@ -22,7 +22,7 @@ limitations under the License.
 [![Discord](https://dcbadge.limes.pink/api/server/D92uqZRjCZ?style=flat)](https://discord.gg/D92uqZRjCZ)
 [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/ai-dynamo/dynamo)

-| **[Roadmap](https://github.com/ai-dynamo/dynamo/issues/762)** | **[Support matrix](https://github.com/ai-dynamo/dynamo/blob/main/docs/support_matrix.md)** | **[Documentation](https://docs.nvidia.com/dynamo/latest/index.html)** | **[Examples](https://github.com/ai-dynamo/dynamo/tree/main/examples)** | **[Prebuilt containers](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo)** | **[Design Proposals](https://github.com/ai-dynamo/enhancements)** | **[Blogs](https://developer.nvidia.com/blog/tag/nvidia-dynamo)**
+| **[Roadmap](https://github.com/ai-dynamo/dynamo/issues/2486)** | **[Support matrix](https://github.com/ai-dynamo/dynamo/blob/main/docs/reference/support-matrix.md)** | **[Documentation](https://docs.nvidia.com/dynamo/latest/index.html)** | **[Examples](https://github.com/ai-dynamo/dynamo/tree/main/examples)** | **[Prebuilt containers](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-dynamo/collections/ai-dynamo)** | **[Design Proposals](https://github.com/ai-dynamo/enhancements)** | **[Blogs](https://developer.nvidia.com/blog/tag/nvidia-dynamo)**

 # NVIDIA Dynamo

@@ -56,9 +56,9 @@ Dynamo is designed to be inference engine agnostic (supports TRT-LLM, vLLM, SGLa

 | Feature | vLLM | SGLang | TensorRT-LLM |
 | ------------------------------------------------------------------------------------------------- | ---- | ------ | ------------ |
-| [**Disaggregated Serving**](/docs/architecture/disagg_serving.md) ||||
-| [**Conditional Disaggregation**](/docs/architecture/disagg_serving.md#conditional-disaggregation) | 🚧 | 🚧 | 🚧 |
-| [**KV-Aware Routing**](/docs/architecture/kv_cache_routing.md) ||||
+| [**Disaggregated Serving**](/docs/design_docs/disagg_serving.md) ||||
+| [**Conditional Disaggregation**](/docs/design_docs/disagg_serving.md#conditional-disaggregation) | 🚧 | 🚧 | 🚧 |
+| [**KV-Aware Routing**](/docs/router/kv_cache_routing.md) ||||
 | [**Load Based Planner**](docs/planner/load_planner.md) | 🚧 | 🚧 | 🚧 |
 | [**SLA-Based Planner**](docs/planner/sla_planner.md) ||||
 | [**KVBM**](docs/kvbm/kvbm_architecture.md) || 🚧 ||

benchmarks/router/README.md

Lines changed: 1 addition & 1 deletion
@@ -116,7 +116,7 @@ To see all available router arguments, run:
 python -m dynamo.frontend --help
 ```

-For detailed explanations of router arguments (especially KV cache routing parameters), see the [KV Cache Routing documentation](../../docs/architecture/kv_cache_routing.md).
+For detailed explanations of router arguments (especially KV cache routing parameters), see the [KV Cache Routing documentation](../../docs/router/kv_cache_routing.md).

 #### Launching a Standalone Router for Prefill Workers (Optional)


components/README.md

Lines changed: 1 addition & 1 deletion
@@ -31,7 +31,7 @@ Each engine provides launch scripts for different deployment patterns in their r

 ## Core Components

-### [Backends](src/dynamo/)
+### [Backends](backends/)

 The backends directory contains inference engine integrations and implementations, with a key focus on:


components/backends/sglang/deploy/README.md

Lines changed: 1 addition & 1 deletion
@@ -144,7 +144,7 @@ All templates use **DeepSeek-R1-Distill-Llama-8B** as the default model. But you

 ## Further Reading

-- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
+- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create_deployment.md)
 - **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
 - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)

components/backends/trtllm/deploy/README.md

Lines changed: 4 additions & 4 deletions
@@ -153,7 +153,7 @@ args:

 ### 3. Deploy

-See the [Create Deployment Guide](../../../../docs/kubernetes/create_deployment.md) to learn how to deploy the deployment file.
+See the [Create Deployment Guide](../../../../docs/kubernetes/deployment/create_deployment.md) to learn how to deploy the deployment file.

 First, create a secret for the HuggingFace token.
 ```bash
@@ -258,7 +258,7 @@ For detailed configuration instructions, see the [KV cache transfer guide](../..

 ## Request Migration

-You can enable [request migration](../../../../docs/architecture/request_migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
+You can enable [request migration](../../../../docs/fault_tolerance/request_migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:

 ```yaml
 args:
@@ -277,11 +277,11 @@ Configure the `model` name and `host` based on your deployment.

 ## Further Reading

-- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
+- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create_deployment.md)
 - **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
 - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
-- **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
+- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design_docs/disagg_serving.md), [KV-Aware Routing](../../../../docs/router/kv_cache_routing.md)
 - **Multinode Deployment**: [Multinode Examples](../../../../docs/backends/trtllm/multinode/multinode-examples.md)
 - **Speculative Decoding**: [Llama 4 + Eagle Guide](../../../../docs/backends/trtllm/llama4_plus_eagle.md)
 - **Kubernetes CRDs**: [Custom Resources Documentation](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/)

components/backends/vllm/deploy/README.md

Lines changed: 3 additions & 3 deletions
@@ -224,7 +224,7 @@ All templates use **Qwen/Qwen3-0.6B** as the default model, but you can use any

 ## Request Migration

-You can enable [request migration](../../../../docs/architecture/request_migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:
+You can enable [request migration](../../../../docs/fault_tolerance/request_migration.md) to handle worker failures gracefully by adding the migration limit argument to worker configurations:

 ```yaml
 args:
@@ -234,12 +234,12 @@ args:

 ## Further Reading

-- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/create_deployment.md)
+- **Deployment Guide**: [Creating Kubernetes Deployments](../../../../docs/kubernetes/deployment/create_deployment.md)
 - **Quickstart**: [Deployment Quickstart](../../../../docs/kubernetes/README.md)
 - **Platform Setup**: [Dynamo Cloud Installation](../../../../docs/kubernetes/installation_guide.md)
 - **SLA Planner**: [SLA Planner Quickstart Guide](../../../../docs/planner/sla_planner_quickstart.md)
 - **Examples**: [Deployment Examples](../../../../docs/examples/README.md)
-- **Architecture Docs**: [Disaggregated Serving](../../../../docs/architecture/disagg_serving.md), [KV-Aware Routing](../../../../docs/architecture/kv_cache_routing.md)
+- **Architecture Docs**: [Disaggregated Serving](../../../../docs/design_docs/disagg_serving.md), [KV-Aware Routing](../../../../docs/router/kv_cache_routing.md)

 ## Troubleshooting


components/src/dynamo/router/README.md

Lines changed: 3 additions & 3 deletions
@@ -3,7 +3,7 @@

 # Standalone Router

-A backend-agnostic standalone KV-aware router service for Dynamo deployments. For details on how KV-aware routing works, see the [KV Cache Routing documentation](/docs/architecture/kv_cache_routing.md).
+A backend-agnostic standalone KV-aware router service for Dynamo deployments. For details on how KV-aware routing works, see the [KV Cache Routing documentation](/docs/router/kv_cache_routing.md).

 ## Overview

@@ -29,7 +29,7 @@ python -m dynamo.router \
 - `--endpoint`: Full endpoint path for workers in the format `namespace.component.endpoint` (e.g., `dynamo.prefill.generate`)

 **Router Configuration:**
-For detailed descriptions of all KV router configuration options including `--block-size`, `--kv-overlap-score-weight`, `--router-temperature`, `--no-kv-events`, `--router-replica-sync`, `--router-snapshot-threshold`, `--router-reset-states`, and `--no-track-active-blocks`, see the [KV Cache Routing documentation](/docs/architecture/kv_cache_routing.md).
+For detailed descriptions of all KV router configuration options including `--block-size`, `--kv-overlap-score-weight`, `--router-temperature`, `--no-kv-events`, `--router-replica-sync`, `--router-snapshot-threshold`, `--router-reset-states`, and `--no-track-active-blocks`, see the [KV Cache Routing documentation](/docs/router/kv_cache_routing.md).

 ## Architecture

@@ -98,6 +98,6 @@ See [`components/src/dynamo/vllm/handlers.py`](../vllm/handlers.py) for a refere

 ## See Also

-- [KV Cache Routing Architecture](/docs/architecture/kv_cache_routing.md) - Detailed explanation of KV-aware routing
+- [KV Cache Routing Architecture](/docs/router/kv_cache_routing.md) - Detailed explanation of KV-aware routing
 - [Frontend Router](../frontend/README.md) - Main HTTP frontend with integrated routing
 - [Router Benchmarking](/benchmarks/router/README.md) - Performance testing and tuning

deploy/cloud/pre-deployment/README.md

Lines changed: 1 addition & 1 deletion
@@ -21,7 +21,7 @@ This directory contains a pre-deployment check script that verifies your Kuberne

 - For NCCL tests, please refer to the [NCCL tests](https://docs.nebius.com/kubernetes/gpu/nccl-test#run-tests) for more details.

-- For NIXL benchmark, please refer to the [NIXL benchmark pre-deployment checks](/deploy/cloud/pre-deployment/nixl/README.md) for more details.
+For the latest pre-deployment check instructions, see the [main branch version of this README](https://github.com/ai-dynamo/dynamo/blob/main/deploy/cloud/pre-deployment/README.md).

 ## Usage


deploy/inference-gateway/README.md

Lines changed: 2 additions & 2 deletions
@@ -13,7 +13,7 @@ Currently, these setups are only supported with the kGateway based Inference Gat

 - [Prerequisites](#prerequisites)
 - [Installation Steps](#installation-steps)
-- [Usage](#usage)
+- [Usage](#6-usage)

 ## Prerequisites

@@ -200,7 +200,7 @@ You can configure the plugin by setting environment vars in your [values-epp-awa
 - Set `DYNAMO_OVERLAP_SCORE_WEIGHT` to weigh how heavily the score uses token overlap (predicted KV cache hits) versus other factors (load, historical hit rate). Higher weight biases toward reusing workers with similar cached prefixes.
 - Set `DYNAMO_ROUTER_TEMPERATURE` to soften or sharpen the selection curve when combining scores. Low temperature makes the router pick the top candidate deterministically; higher temperature lets lower-scoring workers through more often (exploration).
 - Set `DYNAMO_USE_KV_EVENTS=false` if you want to disable KV event tracking while using kv-routing
-- See the [KV cache routing design](../../docs/architecture/kv_cache_routing.md) for details.
+- See the [KV cache routing design](../../docs/router/kv_cache_routing.md) for details.


 ```bash

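Reviewer note on the `DYNAMO_ROUTER_TEMPERATURE` bullet in the hunk above: the behavior it describes (low temperature picks the top candidate, higher temperature lets lower-scoring workers through) matches ordinary softmax sampling over scores. The following is a minimal sketch under that assumption only. The function name `softmax_pick`, the "higher score is better" convention, and the example scores are hypothetical and are not taken from the Dynamo router, whose actual scoring may differ.

```python
import math
import random


def softmax_pick(scores, temperature, rng=random):
    """Pick a worker index from `scores`, where a higher score is better.

    Hypothetical illustration of temperature-scaled selection; this is
    not the Dynamo router's implementation.
    """
    if temperature <= 0:
        # Zero temperature: always pick the highest-scoring worker.
        return max(range(len(scores)), key=lambda i: scores[i])
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(scores)
    weights = [math.exp((s - top) / temperature) for s in scores]
    # Sample one index with probability proportional to its weight.
    return rng.choices(range(len(scores)), weights=weights, k=1)[0]
```

With scores such as `[0.2, 0.9, 0.5]`, a temperature near zero returns index 1 essentially every time, while a large temperature spreads the picks across all three workers.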
deploy/logging/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 # Dynamo Logging on Kubernetes

-For detailed documentation on collecting and visualizing logs on Kubernetes, see [docs/kubernetes/logging.md](../../docs/kubernetes/logging.md).
+For detailed documentation on collecting and visualizing logs on Kubernetes, see [docs/kubernetes/observability/logging.md](../../docs/kubernetes/observability/logging.md).
