Skip to content

Commit e21747d

Browse files
authored
chore: sgl version bump and dp attention yaml fix (#1360)
1 parent d250018 commit e21747d

File tree

2 files changed

+4
-3
lines changed

2 files changed

+4
-3
lines changed

container/Dockerfile.sglang

Lines changed: 3 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -136,8 +136,9 @@ RUN if [ "$ARCH" = "arm64" ]; then \
136136

137137
# Install sglang
138138
# Once either 0.4.6post6 or 0.4.7 is released, we can switch back to using the published version
139-
# This commit references a fix for DP attention and NIXL https://github.com/sgl-project/sglang/pull/6473
140-
ARG SGLANG_COMMIT="e806f708c954020bda7d1cc98035a44fd6a4eb96"
139+
# This commit references multiple perf fixes for DP attention and NIXL https://github.com/sgl-project/sglang/pull/6780
140+
# 6/2(ishan) - moving to ToT for performance purposes
141+
ARG SGLANG_COMMIT="6376b632eb4daef306b89ede0eabdcb89ddff728"
141142
RUN --mount=type=cache,target=/root/.cache/uv \
142143
git clone https://github.com/sgl-project/sglang.git && \
143144
cd sglang && \

examples/sglang/configs/disagg-dp-attention.yaml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,7 +14,7 @@
1414
# limitations under the License.
1515

1616
Frontend:
17-
served_model_name: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
17+
served_model_name: silence09/DeepSeek-R1-Small-2layers
1818
endpoint: dynamo.SGLangWorker.generate
1919
port: 8000
2020

0 commit comments

Comments
 (0)