Dynamic Routing for vLLM #800

terrykong · 2025-03-31T20:15:12Z

terrykong
Mar 31, 2025
Maintainer

We need a router sidecar that queries telemetry from vLLM, determines which vLLM instance has the least load, and sends more prompts there.
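A minimal sketch of the least-load selection described above, not a proposed implementation. It assumes each vLLM server exposes its Prometheus metrics at `/metrics` and that "load" can be approximated as running plus waiting requests (the `vllm:num_requests_running` and `vllm:num_requests_waiting` gauges). The scoring formula and the helper names `parse_load`, `fetch_metrics`, and `pick_least_loaded` are hypothetical, chosen only for illustration.

```python
import re
import urllib.request

# Gauges read from each instance's /metrics endpoint.
# Treating their sum as "load" is an assumption of this sketch.
RUNNING = "vllm:num_requests_running"
WAITING = "vllm:num_requests_waiting"

# Matches one Prometheus sample line: name, optional {labels}, value.
# Simplification: label values containing "}" are not handled.
_SAMPLE = re.compile(
    r"^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{[^}]*\})?\s+(?P<value>\S+)"
)


def parse_load(metrics_text: str) -> float:
    """Sum running and waiting requests from Prometheus text output."""
    load = 0.0
    for line in metrics_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        match = _SAMPLE.match(line)
        if match and match.group("name") in (RUNNING, WAITING):
            load += float(match.group("value"))
    return load


def fetch_metrics(base_url: str, timeout: float = 2.0) -> str:
    """Fetch the raw metrics text from one instance (hypothetical helper)."""
    with urllib.request.urlopen(f"{base_url}/metrics", timeout=timeout) as resp:
        return resp.read().decode("utf-8")


def pick_least_loaded(loads: dict[str, float]) -> str:
    """Return the instance URL with the smallest load score."""
    if not loads:
        raise ValueError("no vLLM instances to route to")
    return min(loads, key=loads.get)
```

A router loop would call `fetch_metrics` for each instance, score the result with `parse_load`, and forward the next prompt to the URL returned by `pick_least_loaded`. Polling introduces staleness, so a real sidecar would likely also track requests it has dispatched but that are not yet reflected in the metrics.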

terrykong · 2025-03-31T21:23:26Z

terrykong
Mar 31, 2025
Maintainer Author

Example router from dynamo: https://github.com/ai-dynamo/dynamo/blob/main/examples/llm/components/kv_router.py

dchichkov · 2025-04-01T00:47:45Z

dchichkov
Apr 1, 2025

There's this repo - https://github.com/VectorInstitute/vector-inference

This repository provides an easy-to-use solution to run inference servers on Slurm-managed computing clusters using vLLM.

Note also:
https://docs.vllm.ai/projects/production-stack/en/latest/user_manual/router/cmd.html

Note that Slurm autoscaling, with an option to backfill Slurm capacity, is the next obvious feature.

euronymous-aithal · 2025-09-26T04:11:44Z

euronymous-aithal
Sep 26, 2025
Maintainer

@terrykong this is the updated standalone router: https://github.com/ai-dynamo/dynamo/tree/main/examples/deployments/router_standalone

terrykong · 2025-09-26T04:39:53Z

terrykong
Sep 26, 2025
Maintainer Author

Thanks for the pointer, @euronymous-aithal. Converting this discussion to an issue.

#1210

Please continue the discussion on #1210.
