perf: compress dynamic SSR and API responses #2407
adapter-node serves pre-compressed static assets via sirv but streams SSR HTML and API JSON uncompressed. Wrap the handler in a small server that negotiates brotli/gzip for dynamic responses. Streaming endpoints (application/jsonl token stream, text/event-stream SSE) are excluded from compression so tokens keep flowing chunk-by-chunk. Measured on the built app: GET / went from 198,427 to 23,528 bytes (-88%).
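The streaming exclusion described above comes down to a content-type check made before any encoding is chosen. The following is a minimal sketch of such a check, not the code from this PR: `STREAMING_TYPES` and `isStreamingType` are hypothetical names, and the only facts taken from the description are the two content types.

```javascript
// Content types that must keep flowing chunk-by-chunk (taken from the PR description).
const STREAMING_TYPES = new Set(["application/jsonl", "text/event-stream"]);

// Hypothetical helper: true when a Content-Type header value names a streaming type.
function isStreamingType(contentType) {
	if (!contentType) return false;
	// Drop parameters such as "; charset=utf-8" and normalize case before comparing.
	const mime = String(contentType).split(";")[0].trim().toLowerCase();
	return STREAMING_TYPES.has(mime);
}
```

A compression filter could return false whenever `isStreamingType(res.getHeader("Content-Type"))` is true, leaving those responses unencoded and unbuffered.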
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dce291c7c4
    });

    const shutdown = () => {
    	server.close(() => process.exit(0));
Avoid exiting before shutdown hooks finish
When SIGTERM/SIGINT arrives while there are no active HTTP requests, server.close can invoke this callback immediately, and process.exit(0) terminates the process before the app's existing async shutdown hooks registered in src/lib/server/exitHandler.ts can finish (for example closing MongoDB in src/lib/server/database.ts and the metrics server in src/lib/server/metrics.ts). The adapter-node server this replaces did not call process.exit in its close callback, so normal container shutdowns now risk skipping cleanup; let the event loop drain or coordinate with the existing exit handler instead.
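One way to act on this suggestion is to stop accepting connections and then let the event loop drain, forcing an exit only as a fallback. The sketch below is an assumption about what that could look like, not the fix that was eventually merged: `closeGracefully`, the 30-second default, and the exit code 1 on timeout are all hypothetical.

```javascript
// Hypothetical helper: starts a graceful shutdown without calling process.exit
// from the close callback, so other async cleanup hooks can still finish.
function closeGracefully(server, timeoutMs = 30_000) {
	// Stop accepting new connections; in-flight requests are allowed to finish.
	server.close();
	// Release idle keep-alive sockets so they do not hold the event loop open
	// (available on http.Server since Node 18.2; skipped if absent).
	server.closeIdleConnections?.();
	// Fallback (assumption): force an exit only if the loop has not drained in time.
	const timer = setTimeout(() => process.exit(1), timeoutMs);
	// unref() keeps this timer from being the thing that holds the process open.
	timer.unref();
	return timer;
}

// Usage (not executed here):
// for (const sig of ["SIGTERM", "SIGINT"]) process.on(sig, () => closeGracefully(server));
```

With this shape the process ends on its own once the HTTP server and the other shutdown hooks have released their handles, rather than being cut off by the close callback.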
Superseded by #2409, which lands the same compression approach with a mime filter for both streaming content types (application/jsonl and text/event-stream), a Vary merge that survives SvelteKit's own header writes, brotli level pinned to 4, unix-socket support, and graceful shutdown. Verified end to end against a production build.
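For readers unfamiliar with the Vary merge mentioned above: the idea is to append Accept-Encoding to whatever Vary value is already present instead of overwriting it. How #2409 implements this is not shown in this thread, so the following is only an illustrative sketch, and `mergeVary` is a hypothetical name.

```javascript
// Hypothetical helper: adds `field` to an existing Vary header value without
// dropping fields that another layer has already set.
function mergeVary(existing, field) {
	// getHeader() may return undefined, a string, or an array of strings.
	const current = Array.isArray(existing) ? existing.join(", ") : String(existing ?? "");
	const fields = current.split(",").map((s) => s.trim()).filter(Boolean);
	// "Vary: *" already covers every request header, so leave it alone.
	if (fields.includes("*")) return "*";
	// Header field names are case-insensitive; avoid adding a duplicate.
	const present = fields.some((f) => f.toLowerCase() === field.toLowerCase());
	return present ? fields.join(", ") : [...fields, field].join(", ");
}
```

For example, `res.setHeader("Vary", mergeVary(res.getHeader("Vary"), "Accept-Encoding"))` would keep a previously set `Accept` entry intact.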
What
Adds a thin production server (server.mjs) that wraps the adapter-node handler with compression, so dynamic responses (SSR HTML, /api/v2/* JSON) are served with brotli/gzip. Static assets were already pre-compressed at build time via sirv, but everything rendered at runtime went out raw.
Measured against hf.co/chat in production: the HTML document is 318-355 KB with no content-encoding on every page view, and every conversation-switch JSON payload (20-70 KB) is also uncompressed.
Why streaming is excluded
The chat token stream (POST /conversation/[id]) responds with Content-Type: application/jsonl, and /api/v2/conversations/updates uses text/event-stream. A compression middleware buffers output, which would silently destroy time-to-first-token. Both content types are explicitly skipped, so token streaming behavior is unchanged.
Deployment
- entrypoint.sh now starts node /app/server.mjs instead of build/index.js (PORT/HOST/SHUTDOWN_TIMEOUT env vars behave the same, body-size limiting stays inside the adapter handler)
- Dockerfile copies server.mjs into the image
- npm run dev / vite preview are unaffected
Verification
- application/jsonl and text/event-stream pass through unencoded and unbuffered (first chunk arrives at ~100 ms of a ~500 ms stream, 5 distinct chunks received)
- server.mjs: GET /chat/ went from 198,427 to 23,528 bytes transferred (-88%), Content-Encoding: br, Vary: Accept-Encoding set
- npm run check, npm run lint, npm test (392 tests) all pass