fix: cap payload depth and improve serialized arg readability for Hawk#74
fix: cap payload depth and improve serialized arg readability for Hawk#74pavelzotikov wants to merge 7 commits intomasterfrom
Conversation
Harden stack-trace and event payload building when frames include deeply nested or self-referential data (e.g. $GLOBALS): cap recursion depth in Serializer::prepare and EventPayloadBuilder::transformForJson, limit arguments per frame and per-line size, and keep only safe frame keys. Improve readability of serialized argument values for the Hawk UI by using JSON_PRETTY_PRINT and inserting zero-width break opportunities in long string scalars (valid JSON, smaller horizontal overflow). Tests updated to compare structured JSON and to cover long-string soft breaks.
Co-authored-by: Peter <specc.dev@gmail.com>
There was a problem hiding this comment.
Pull request overview
This PR hardens Hawk event/stacktrace payload building against deeply nested and self-referential data while improving how serialized argument values display in the Hawk UI.
Changes:
- Add max-depth caps to
Serializer::prepare()andEventPayloadBuilder::transformForJson()to prevent runaway recursion. - Limit stacktrace argument count/line length and prevent raw
argsfrom being copied intoadditionalData. - Switch serializer output to
JSON_PRETTY_PRINTand insert soft wrap opportunities into long scalar strings; update/add unit tests accordingly.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/Serializer.php |
Adds max-depth protection and soft-break insertion for long strings; uses JSON_PRETTY_PRINT. |
src/EventPayloadBuilder.php |
Normalizes frames more defensively, formats/limits arguments, and depth-limits additional data transformation. |
src/StacktraceFrameBuilder.php |
Adds argument count limiting and truncates long “name = value” lines. |
tests/Unit/SerializerTest.php |
Updates assertions to compare decoded JSON structures and adds coverage for soft-break insertion + depth truncation. |
tests/Unit/EventPayloadBuilderTest.php |
Adds targeted tests for backtrace normalization behavior (args handling, limits, truncation, deep additional data). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| return implode($zwsp, $parts); | ||
| } | ||
|
|
||
| return implode($zwsp, str_split($value, $chunk)); |
| return substr($line, 0, self::MAX_ARGUMENT_LINE_BYTES - 3) . '...'; | ||
| } | ||
|
|
| if (strlen($line) <= $maxBytes) { | ||
| return $line; | ||
| } | ||
|
|
||
| return substr($line, 0, $maxBytes - 3) . '...'; |
| if (strlen($line) <= $maxBytes) { | ||
| return $line; | ||
| } | ||
|
|
||
| return substr($line, 0, $maxBytes - 3) . '...'; |
https://github.com/codex-team/hawk.php into fix/stack-trace-serialization-globals-and-readability
- Stop truncating the value side of `name = serializedValue`; only cap parameter names (MAX_ARGUMENT_NAME_BYTES). - Prebuilt argument lines without ` = ` pass through untouched. - Remove MAX_ARGUMENT_LINE_BYTES / MAX_ARGUMENT_VALUE_BYTES; update tests.
There was a problem hiding this comment.
Pull request overview
This PR hardens Hawk event/stacktrace payload generation against deeply nested and cyclic data structures, while reshaping stack argument formatting to better match Hawk’s expected arguments string-list contract.
Changes:
- Add depth/cycle protection to
Serializer::prepare()and switch serializer output to pretty-printed JSON. - Update
EventPayloadBuilder::normalizeBacktrace()to map rawargsinto Hawkargumentsstrings and keep unknown frame fields inadditionalDatawith depth limiting. - Extend
StacktraceFrameBuilderwith argument count limits and name-side truncation helpers; add tests for these behaviors.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
src/Serializer.php |
Adds max-depth and circular detection in prepare(), and uses JSON_PRETTY_PRINT output. |
src/EventPayloadBuilder.php |
Normalizes backtraces by mapping args→arguments, skipping raw args in additionalData, and depth-limiting JSON transforms. |
src/StacktraceFrameBuilder.php |
Caps argument count per frame and introduces helpers to truncate argument names / normalize prebuilt lines. |
tests/Unit/SerializerTest.php |
Updates assertions to compare decoded JSON and adds tests for depth, circular arrays, and long scalar handling. |
tests/Unit/EventPayloadBuilderTest.php |
Adds coverage for normalizeBacktrace() argument mapping, count limit, prebuilt-line handling, and deep additional data. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (is_array($value)) { | ||
| $result = []; | ||
| foreach ($value as $k => $v) { | ||
| $result[$k] = $this->transformForJson($v); | ||
| $result[$k] = $this->transformForJson($v, $depth + 1); | ||
| } |
There was a problem hiding this comment.
transformForJson() only enforces a max depth; it does not detect circular/self-referential arrays. For values like $GLOBALS (or any array with a self-reference), this will repeatedly re-walk the same structure until the depth limit and can still produce a very large nested payload and high CPU. Consider adding circular reference detection (similar to Serializer::prepare()), returning a sentinel like [circular] when an ancestor is encountered.
| /** | ||
| * Build Hawk `arguments`: string list like "name = serializedValue" (from raw `args` via Serializer). | ||
| * Limits the number of lines ({@see StacktraceFrameBuilder::MAX_FRAME_ARGUMENTS}). Serialized values are not | ||
| * length-truncated; only param names are capped ({@see StacktraceFrameBuilder::formatTruncatedArgumentLine}); | ||
| * prebuilt strings are split on the first `" = "` with {@see StacktraceFrameBuilder::truncatePrebuiltArgumentLine}. |
There was a problem hiding this comment.
The PR description mentions capping/truncating each stack argument line by byte length, but buildArgumentsList() explicitly keeps serialized values unbounded and the tests preserve very long prebuilt argument lines. This can still produce oversized events. Either implement a per-line byte cap (for both prebuilt arguments and formatted args) or update the PR description/contract expectations to match the current behavior.
| public static function formatTruncatedArgumentLine(string $name, string $serializedValue): string | ||
| { | ||
| $namePart = self::truncateUtf8StringToMaxBytes($name, self::MAX_ARGUMENT_NAME_BYTES); | ||
|
|
||
| return $namePart . ' = ' . $serializedValue; | ||
| } |
There was a problem hiding this comment.
formatTruncatedArgumentLine() only truncates the argument name; the serialized value (and therefore the full "name = value" line) can still be arbitrarily large (e.g., very long strings/arrays), which can defeat the goal of keeping payloads small. Consider enforcing an explicit max byte length for the entire line and applying it consistently to both formatted and prebuilt argument lines.
| * Lines without `" = "` are returned as-is (no length limit). | ||
| */ | ||
| public static function truncatePrebuiltArgumentLine(string $line): string | ||
| { | ||
| $separator = ' = '; | ||
| $position = strpos($line, $separator); | ||
| if ($position === false) { | ||
| return $line; |
There was a problem hiding this comment.
truncatePrebuiltArgumentLine() returns lines without the " = " delimiter unchanged (no length cap). That allows a single prebuilt argument entry to be extremely large and bypass any payload-size protections. Consider applying the same max-byte truncation to the whole line even when the delimiter is missing.
| * Lines without `" = "` are returned as-is (no length limit). | |
| */ | |
| public static function truncatePrebuiltArgumentLine(string $line): string | |
| { | |
| $separator = ' = '; | |
| $position = strpos($line, $separator); | |
| if ($position === false) { | |
| return $line; | |
| * Lines without `" = "` are truncated as whole lines to the configured byte cap. | |
| */ | |
| public static function truncatePrebuiltArgumentLine(string $line): string | |
| { | |
| $separator = ' = '; | |
| $position = strpos($line, $separator); | |
| if ($position === false) { | |
| return self::truncateUtf8StringToMaxBytes($line, self::MAX_ARGUMENT_NAME_BYTES); |
| try { | ||
| $newArguments[] = sprintf('%s = %s', $name, $value); | ||
| $newArguments[] = self::formatTruncatedArgumentLine((string) $name, $value); | ||
| } catch (\Exception $e) { | ||
| // Ignore unknown types | ||
| } |
There was a problem hiding this comment.
The try/catch around formatTruncatedArgumentLine() appears unnecessary: neither serializeValue() nor formatTruncatedArgumentLine() throws, so the catch block should be unreachable. Removing it would simplify the control flow (or, if an exception is expected, catch a specific exception from the actual throwing call).
Summary
Makes event/stacktrace payload building safe for deeply nested and self-referential values (e.g. $GLOBALS in frame arguments) and keeps stack arguments aligned with the Hawk contract: a list of full strings like name = serializedValue, with the list length capped and each line bounded by byte length so events stay small and JSON-safe.
What changed
– Serializer::prepare — max nesting depth to stop runaway recursion on circular structures; output is JSON with JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT for readable snapshots.
– EventPayloadBuilder — normalizeBacktrace builds frames with an allowlist of keys, depth-limits non-frame data via transformForJson, maps args to formatted arguments without duplicating raw args in additionalData, caps the number of argument lines per frame, and truncates each argument line to MAX_ARGUMENT_LINE_BYTES (including prebuilt arguments lines).
– StacktraceFrameBuilder — caps how many function arguments are included per frame and truncates long name = value lines to the same per-line byte limit.
– Tests — cover normalization, argument count limit (with more args than the cap), truncation of long prebuilt argument lines, and serializer depth/JSON behavior.