Add request/response body size tracking to trace events #5925
Conversation
All contributors have signed the CLA ✍️ ✅
Pull request overview
This pull request adds request and response body size tracking to workerd trace events. The sizes are captured from the Content-Length header for requests and from the expectedBodySize field for responses, making them accessible to tail workers.
Changes:
- Added `requestSize` field to FetchEventInfo and `bodySize` field to FetchResponseInfo in the Cap'n Proto schema
- Implemented Content-Length header parsing to extract request body size
- Modified ResponseSentTracker to capture and report response body size
- Exposed body sizes as optional JavaScript properties on trace request/response objects
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/workerd/io/worker-interface.capnp | Added requestSize (UInt64) to FetchEventInfo and bodySize (UInt64) to FetchResponseInfo schema definitions |
| src/workerd/io/trace.h | Updated FetchEventInfo and FetchResponseInfo struct declarations with new size fields and constructor parameters |
| src/workerd/io/trace.c++ | Implemented constructors, copy, clone, and reader methods for new size fields |
| src/workerd/io/worker-entrypoint.c++ | Added Content-Length parsing logic and ResponseSentTracker modifications to capture body sizes |
| src/workerd/api/trace.h | Added bodySize getter methods and internal storage to Request and Response trace API classes |
| src/workerd/api/trace.c++ | Implemented getBodySize methods that return null for zero/unknown sizes, otherwise return size as double |
src/workerd/io/worker-entrypoint.c++
Outdated
```cpp
char* end;
auto parsed = strtoull(contentLength.cStr(), &end, 10);
if (end != contentLength.cStr() && *end == '\0') {
  requestSize = parsed;
}
```
Copilot AI (Jan 17, 2026)
The parsing validation check has a subtle issue. The condition `end != contentLength.cStr()` will be true even if parsing partially fails (e.g., for "123abc", it would parse "123" and set `end` to point to 'a').
The validation should check that ALL characters were consumed, not just that SOME characters were consumed:
- `end != contentLength.cStr()` ensures at least one character was parsed
- `*end == '\0'` ensures we reached the end of the string
However, this still allows leading whitespace and '+' signs which strtoull accepts but may not be valid for Content-Length per HTTP specs. Additionally, strtoull returns ULLONG_MAX on overflow and sets errno to ERANGE, which is not checked.
Consider using a stricter validation that only allows digits, or at minimum check errno for overflow.
src/workerd/api/trace.c++
Outdated
```cpp
if (detail->requestSize == 0) {
  return kj::none;
}
return static_cast<double>(detail->requestSize);
```
Copilot AI (Jan 17, 2026)
Converting uint64_t to double can lose precision for values larger than 2^53 (approximately 9 quadrillion bytes or 9 petabytes). While extremely unlikely in practice for HTTP body sizes, this could theoretically cause inaccurate body size reporting for very large bodies.
JavaScript's Number type uses IEEE 754 double precision, which can only accurately represent integers up to Number.MAX_SAFE_INTEGER (2^53 - 1). Beyond this, precision is lost. Consider documenting this limitation or using BigInt if precise representation of very large sizes is required.
src/workerd/api/trace.c++
Outdated
```cpp
if (bodySize == 0) {
  return kj::none;
}
return static_cast<double>(bodySize);
```
Copilot AI (Jan 17, 2026)
Converting uint64_t to double can lose precision for values larger than 2^53 (approximately 9 quadrillion bytes or 9 petabytes). While extremely unlikely in practice for HTTP body sizes, this could theoretically cause inaccurate body size reporting for very large bodies.
JavaScript's Number type uses IEEE 754 double precision, which can only accurately represent integers up to Number.MAX_SAFE_INTEGER (2^53 - 1). Beyond this, precision is lost. Consider documenting this limitation or using BigInt if precise representation of very large sizes is required.
src/workerd/api/trace.h
Outdated
```cpp
JSG_LAZY_READONLY_INSTANCE_PROPERTY(headers, getHeaders);
JSG_LAZY_READONLY_INSTANCE_PROPERTY(method, getMethod);
JSG_LAZY_READONLY_INSTANCE_PROPERTY(url, getUrl);
JSG_LAZY_READONLY_INSTANCE_PROPERTY(bodySize, getBodySize);
```
Copilot AI (Jan 17, 2026)
The new bodySize fields for request and response in trace events lack test coverage. Existing tail worker tests in this directory validate trace event structure but don't include assertions for the newly added bodySize properties on request/response objects.
Consider adding test cases that:
- Verify bodySize is present and correct when Content-Length header exists
- Verify bodySize is null/undefined when Content-Length is missing or invalid
- Verify response bodySize is present when expectedBodySize is known
- Test edge cases like zero-length bodies and very large sizes
```cpp
jsg::Optional<jsg::V8Ref<v8::Object>> cf;
kj::Array<tracing::FetchEventInfo::Header> headers;
kj::String method;
kj::String url;
```
Copilot AI (Jan 17, 2026)
The naming convention is inconsistent between the internal representation and the JavaScript API:
- The Cap'n Proto schema uses `requestSize` for FetchEventInfo and `bodySize` for FetchResponseInfo
- C++ internal structs use `requestSize` for requests and `bodySize` for responses
- The JavaScript API exposes both as `bodySize` on request and response objects

While this provides a consistent JavaScript API where both `request.bodySize` and `response.bodySize` are used, it creates a mismatch between the internal field names and the exposed property names for requests. Consider either:
- Renaming the internal `requestSize` field to `bodySize` for consistency
- Documenting why different names are used (e.g., to distinguish at the protocol level that request size comes from the Content-Length header)
Suggested change:
```cpp
kj::String url;
// Note: This is the request body size, which is exposed to JavaScript as
// `bodySize` on the request object for API consistency with responses.
```
I have read the CLA Document and I hereby sign the CLA
Thanks for your contribution. I haven't fully reviewed it yet, but there is a problem with the approach. This also shows an unfortunate lack of test coverage in this pull request; ideally I would love to see these cases covered correctly in tests to show the feature is working as expected. Thank you again for your contribution, and I hope my tips can help steer you in the right direction 🙏
Add requestSize and bodySize fields to FetchEventInfo and FetchResponseInfo so tail workers can access body size information. Request size is captured from the Content-Length header, and response size from expectedBodySize in ResponseSentTracker. Both are exposed as optional properties in JavaScript trace events. Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
- Fix Content-Length parsing to handle leading/trailing whitespace
- Rename requestSize to bodySize in FetchEventInfo for naming consistency
- Add precision limitation comments for uint64_t to double conversion
- Add tests for FetchEventInfo bodySize serialization/deserialization
- Add tests for FetchResponseInfo bodySize serialization/deserialization

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@danlapid thank you very much for the quick feedback. I'll have a closer look; I just figured I'd make a quick prototype with Claude.
The previous implementation captured body size from Content-Length headers before streaming started. This is incorrect because:
- Content-Length may not be present (chunked encoding)
- Actual bytes streamed may differ from the declared length
- The true body size is only known after streaming completes

This change:
- Adds a ByteCountingOutputStream wrapper to track actual bytes written
- Wraps the response output stream in ResponseSentTracker::send()
- Defers the setReturn() call until proxyTask completes
- Reports the actual byte count instead of the Content-Length header value

The byte counter uses std::atomic for thread safety since proxyTask may run on a different thread than the original request handler.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add BODYSIZE_STR constant to trace-stream.c++
- Update ToJs(FetchEventInfo) to include bodySize for the request body
- Update ToJs(FetchResponseInfo) to include bodySize for the response body
- Update tail-worker-test.js expected output with bodySize values

Note: Integration tests (tail-worker-test@, http-test@) are failing with pre-existing segfaults unrelated to these changes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
```cpp
// A WritableSink wrapper that counts the total bytes written to the underlying sink.
// This is used to track actual response body sizes for trace events.
class ByteCountingWritableSink final: public WritableSinkWrapper {
 public:
```
It doesn't appear that this is used anywhere in the PR?
src/workerd/io/worker-entrypoint.c++
Outdated
```cpp
// Wrap with byte counting for trace events
auto counter = kj::refcounted<ResponseBodyByteCounter>();
byteCounter = kj::addRef(*counter);
return kj::heap<ByteCountingOutputStream>(kj::mv(stream), kj::mv(counter));
```
This is going to incur a performance penalty for all workers unnecessarily. We should only add the counter for workers that actually have tail workers enabled and will actually use the data.
src/workerd/io/worker-entrypoint.c++
Outdated
```cpp
// A refcounted byte counter that can be shared between the stream wrapper and
// the code that needs to read the final byte count after streaming completes.
struct ResponseBodyByteCounter: public kj::Refcounted {
  std::atomic<uint64_t> bytesWritten{0};
```
Is the atomic really necessary? kj i/o objects are not used across multiple threads.
src/workerd/io/worker-entrypoint.c++
Outdated
```diff
 t.setEventInfo(*incomingRequest,
-    tracing::FetchEventInfo(method, kj::str(url), kj::mv(cfJson), kj::mv(traceHeadersArray)));
+    tracing::FetchEventInfo(
+        method, kj::str(url), kj::mv(cfJson), kj::mv(traceHeadersArray), bodySize));
```
This is going to be a bit confusing: when Content-Length is not specified, bodySize will end up being 0, which means we cannot differentiate between no Content-Length (with Transfer-Encoding: chunked) and Content-Length: 0, each of which has very different semantics.
src/workerd/io/worker-entrypoint.c++
Outdated
```cpp
while (*end == ' ' || *end == '\t') end++;  // Skip trailing whitespace
if (end != ptr && *end == '\0') {
  bodySize = parsed;
}
```
kj::StringPtr has a method parseAs<uint64_t>() that should be used instead of rolling your own parsing.
src/workerd/io/worker-entrypoint.c++
Outdated
```cpp
// relying on Content-Length header. This is less critical than response body size
// since request bodies typically have accurate Content-Length.
uint64_t bodySize = 0;
KJ_IF_SOME(contentLength, headers.get(kj::HttpHeaderId::CONTENT_LENGTH)) {
```
This is going to incur an additional perf cost for every worker even those not using tail workers.
```cpp
}
deferredWorkerTracer = t;
deferredHttpResponseStatus = wrappedResponse.getHttpResponseStatus();
traceByteCounter = wrappedResponse.takeByteCounter();
```
```capnp
  value @1 :Text;
}
bodySize @4 :UInt64;
# Request body size in bytes. 0 if unknown or no body.
```
no content-length is not semantically equivalent to content-length: 0, we should not conflate the two.
src/workerd/io/worker-entrypoint.c++
Outdated
```cpp
  return tracing::FetchEventInfo::Header(kj::mv(entry.key), kj::strArray(entry.value, ", "));
};

// Extract request body size from Content-Length header if present.
```
Given that we're capturing the headers above in traceHeaders, do we really need to separately capture and parse the content-length here?
```cpp
}

// Returns a reference to the byte counter for deferred access after streaming.
// The returned ownership should be held until after proxyTask completes.
```
It doesn't return a reference... it transfers ownership. This likely needs protections against being called multiple times.
```cpp
kj::Promise<void> whenWriteDisconnected() override {
  return inner->whenWriteDisconnected();
}
```
This likely should also include overrides for tryPumpFrom and abortWrite. tryPumpFrom is going to be tricky: without it, this will always fall back to a non-optimized pump, but counting the bytes for an optimized pump is tricky. If tail streams are not enabled for a worker, we wouldn't want to force everything through this path.
Changes based on @jasnell's review:
- Remove unused ByteCountingWritableSink from writable-sink.h
- Only enable byte counting when tracing is enabled, to avoid performance overhead for workers without tail workers
- Remove std::atomic since kj I/O objects are single-threaded
- Use kj::StringPtr::tryParseAs<uint64_t>() instead of custom strtoull parsing for Content-Length
- Fix the takeByteCounter() comment to indicate it transfers ownership
- Add a comment explaining why tryPumpFrom() is not overridden (we need to count bytes, so we use the non-optimized write path)

Note on bodySize semantics: 0 is used for both "no Content-Length" and "Content-Length: 0". While these are semantically different in HTTP, for trace events both indicate no/empty body data, which is acceptable for monitoring purposes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This change allows differentiating between:
- kj::none: body size is unknown (no Content-Length header)
- Some(0): body size is explicitly zero (Content-Length: 0)
- Some(N): body size is N bytes

Changes:
- Add a hasBodySize boolean field to the capnp schema for FetchEventInfo and FetchResponseInfo to indicate whether bodySize is valid
- Update trace.h structs to use kj::Maybe<uint64_t> instead of uint64_t
- Update trace.c++ serialization to read/write the hasBodySize flag
- Update trace-stream.c++ ToJs to only include bodySize when known
- Update api/trace.h and api/trace.c++ for the new type
- Update worker-entrypoint.c++ to pass kj::Maybe for bodySize
- Add new test cases for zero bodySize vs unknown bodySize

The JavaScript API returns null when bodySize is unknown, and the actual numeric value (including 0) when it is known.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This comment still stands.
Addresses PR feedback: request body size should be tracked after the body is consumed, not at onset time using the Content-Length header.
- Remove bodySize from FetchEventInfo (onset event)
- Add requestBodySize to FetchResponseInfo (return event)
- Add ByteCountingInputStream to track actual request bytes consumed
- Both request and response body sizes are now reported after streaming
- Byte counting is only enabled when tail workers are active (no perf impact)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@danlapid let me know if it's going in a better direction now. I'm not sure this feature is worth the complexity/extra steps; I was mostly experimenting with workerd. Let me know if you think it makes sense to pursue this further.
I think the current approach is still lacking. It seems like the soonest we can actually have this information in a somewhat sensible way is the Outcome event.
Per the HTTP spec, body size is only definitively known after the body has been fully transmitted. The Outcome event is emitted after all work completes (including body streaming), making it the correct place to report body sizes.

Changes:
- Remove body sizes from FetchResponseInfo (return event)
- Add responseBodySize and requestBodySize to the Outcome struct
- Add a setBodySizes() method to the tracer for deferred body size reporting
- Update streaming tail worker JS serialization for the new schema
- Update the buffered tail worker API to read body sizes from the Trace object

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@steebchen let me know when you've finished cleaning it up. I'm pretty sure most of the changes on this PR are remnants of previous iterations and are not actually still strictly necessary.
Summary

Add `requestSize` and `bodySize` fields to FetchEventInfo and FetchResponseInfo in workerd trace events. Request size is captured from the Content-Length header, and response size from the expectedBodySize field. These sizes are now accessible to tail workers via trace event properties.

Changes
🤖 Generated with Claude Code