This repository was archived by the owner on Oct 15, 2025. It is now read-only.

@namasl
Contributor

@namasl namasl commented Jun 26, 2025

This PR addresses #345, adding startup, liveness, and readiness probes to prefill and decode pods.

The startup probes are deliberately lenient so that they cover both small models that load quickly and large models with long load times (the values here allow up to 30 minutes).
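
As a rough illustration of how a 30-minute startup budget can be expressed, here is a minimal sketch of the three probes on a container. The endpoint path, port, and every numeric value below are assumptions chosen for the example, not the values merged in this PR. Kubernetes lets a startup probe fail `failureThreshold` times, `periodSeconds` apart, before restarting the container, and it holds off the liveness and readiness probes until the startup probe succeeds.

```yaml
# Illustrative sketch only: path, port, and timings are assumed, not taken from the chart.
startupProbe:
  httpGet:
    path: /health        # assumed health endpoint
    port: 8000           # assumed serving port
  periodSeconds: 30      # probe every 30 s
  failureThreshold: 60   # 60 failures x 30 s = 1800 s (30 minutes) before a restart
livenessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 10      # starts only after the startup probe has succeeded
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health
    port: 8000
  periodSeconds: 5       # a failing pod is removed from Service endpoints, not restarted
  failureThreshold: 3
```

With this split, a model that loads in a minute passes the startup probe on an early attempt, while a slow-loading model is not killed by the much tighter liveness settings during startup.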

@nerdalert
Member

@namasl this looks great. I tested it without any issues. Can you bump the chart version to v1.0.21 in https://github.com/llm-d/llm-d-deployer/blob/main/charts/llm-d/Chart.yaml and re-run pre-commit run -a?

cc/ @Gregory-Pereira

@nerdalert
Member

@namasl I apologize that this hasn't merged yet. I really like this feature. Could you bump https://github.com/llm-d/llm-d-deployer/blob/main/charts/llm-d/Chart.yaml one more time, to version: 1.0.22, and I will merge it 🙏

@nerdalert
Member

nerdalert commented Jul 8, 2025

@namasl mind rebasing and running pre-commit run -a as well, please? ty!

Member

@nerdalert nerdalert left a comment

LGTM ty for the feature.

@nerdalert nerdalert merged commit a51e9ca into llm-d:main Jul 8, 2025
3 checks passed

2 participants