Skip to content

fix: cap payload depth and improve serialized arg readability for Hawk#74

Open
pavelzotikov wants to merge 7 commits intomasterfrom
fix/stack-trace-serialization-globals-and-readability
Open

fix: cap payload depth and improve serialized arg readability for Hawk#74
pavelzotikov wants to merge 7 commits intomasterfrom
fix/stack-trace-serialization-globals-and-readability

Conversation

@pavelzotikov
Copy link
Copy Markdown
Contributor

@pavelzotikov pavelzotikov commented Apr 23, 2026

Summary
Makes event/stacktrace payload building safe for deeply nested and self-referential values (e.g. $GLOBALS in frame arguments) and keeps stack arguments aligned with the Hawk contract: a list of full strings like name = serializedValue, with the list length capped and each line bounded by byte length so events stay small and JSON-safe.

What changed
– Serializer::prepare — max nesting depth to stop runaway recursion on circular structures; output is JSON with JSON_UNESCAPED_UNICODE | JSON_PRETTY_PRINT for readable snapshots.
– EventPayloadBuilder — normalizeBacktrace builds frames with an allowlist of keys, depth-limits non-frame data via transformForJson, maps args to formatted arguments without duplicating raw args in additionalData, caps the number of argument lines per frame, and truncates each argument line to MAX_ARGUMENT_LINE_BYTES (including prebuilt arguments lines).
– StacktraceFrameBuilder — caps how many function arguments are included per frame and truncates long name = value lines to the same per-line byte limit.
– Tests — cover normalization, argument count limit (with more args than the cap), truncation of long prebuilt argument lines, and serializer depth/JSON behavior.

Harden stack-trace and event payload building when frames include deeply nested or self-referential data (e.g. $GLOBALS): cap recursion depth in Serializer::prepare and EventPayloadBuilder::transformForJson, limit arguments per frame and per-line size, and keep only safe frame keys. Improve readability of serialized argument values for the Hawk UI by using JSON_PRETTY_PRINT and inserting zero-width break opportunities in long string scalars (valid JSON, smaller horizontal overflow). Tests updated to compare structured JSON and to cover long-string soft breaks.
@pavelzotikov pavelzotikov requested a review from neSpecc as a code owner April 23, 2026 14:09
Comment thread src/EventPayloadBuilder.php Outdated
Comment thread src/Serializer.php Outdated
Comment thread src/StacktraceFrameBuilder.php Outdated
@neSpecc neSpecc requested a review from Copilot April 23, 2026 14:24
Co-authored-by: Peter <specc.dev@gmail.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Hawk event/stacktrace payload building against deeply nested and self-referential data while improving how serialized argument values display in the Hawk UI.

Changes:

  • Add max-depth caps to Serializer::prepare() and EventPayloadBuilder::transformForJson() to prevent runaway recursion.
  • Limit stacktrace argument count/line length and prevent raw args from being copied into additionalData.
  • Switch serializer output to JSON_PRETTY_PRINT and insert soft wrap opportunities into long scalar strings; update/add unit tests accordingly.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/Serializer.php Adds max-depth protection and soft-break insertion for long strings; uses JSON_PRETTY_PRINT.
src/EventPayloadBuilder.php Normalizes frames more defensively, formats/limits arguments, and depth-limits additional data transformation.
src/StacktraceFrameBuilder.php Adds argument count limiting and truncates long “name = value” lines.
tests/Unit/SerializerTest.php Updates assertions to compare decoded JSON structures and adds coverage for soft-break insertion + depth truncation.
tests/Unit/EventPayloadBuilderTest.php Adds targeted tests for backtrace normalization behavior (args handling, limits, truncation, deep additional data).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/Serializer.php Outdated
return implode($zwsp, $parts);
}

return implode($zwsp, str_split($value, $chunk));
Comment thread src/StacktraceFrameBuilder.php Outdated
Comment on lines 295 to 297
return substr($line, 0, self::MAX_ARGUMENT_LINE_BYTES - 3) . '...';
}

Comment thread src/EventPayloadBuilder.php Outdated
Comment on lines +269 to +273
if (strlen($line) <= $maxBytes) {
return $line;
}

return substr($line, 0, $maxBytes - 3) . '...';
Comment thread src/EventPayloadBuilder.php Outdated
Comment on lines +269 to +273
if (strlen($line) <= $maxBytes) {
return $line;
}

return substr($line, 0, $maxBytes - 3) . '...';
- Stop truncating the value side of `name = serializedValue`; only cap
  parameter names (MAX_ARGUMENT_NAME_BYTES).
- Prebuilt argument lines without ` = ` pass through untouched.
- Remove MAX_ARGUMENT_LINE_BYTES / MAX_ARGUMENT_VALUE_BYTES; update tests.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Hawk event/stacktrace payload generation against deeply nested and cyclic data structures, while reshaping stack argument formatting to better match Hawk’s expected arguments string-list contract.

Changes:

  • Add depth/cycle protection to Serializer::prepare() and switch serializer output to pretty-printed JSON.
  • Update EventPayloadBuilder::normalizeBacktrace() to map raw args into Hawk arguments strings and keep unknown frame fields in additionalData with depth limiting.
  • Extend StacktraceFrameBuilder with argument count limits and name-side truncation helpers; add tests for these behaviors.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/Serializer.php Adds max-depth and circular detection in prepare(), and uses JSON_PRETTY_PRINT output.
src/EventPayloadBuilder.php Normalizes backtraces by mapping argsarguments, skipping raw args in additionalData, and depth-limiting JSON transforms.
src/StacktraceFrameBuilder.php Caps argument count per frame and introduces helpers to truncate argument names / normalize prebuilt lines.
tests/Unit/SerializerTest.php Updates assertions to compare decoded JSON and adds tests for depth, circular arrays, and long scalar handling.
tests/Unit/EventPayloadBuilderTest.php Adds coverage for normalizeBacktrace() argument mapping, count limit, prebuilt-line handling, and deep additional data.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 283 to 287
if (is_array($value)) {
$result = [];
foreach ($value as $k => $v) {
$result[$k] = $this->transformForJson($v);
$result[$k] = $this->transformForJson($v, $depth + 1);
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

transformForJson() only enforces a max depth; it does not detect circular/self-referential arrays. For values like $GLOBALS (or any array with a self-reference), this will repeatedly re-walk the same structure until the depth limit and can still produce a very large nested payload and high CPU. Consider adding circular reference detection (similar to Serializer::prepare()), returning a sentinel like [circular] when an ancestor is encountered.

Copilot uses AI. Check for mistakes.
Comment on lines +234 to +238
/**
* Build Hawk `arguments`: string list like "name = serializedValue" (from raw `args` via Serializer).
* Limits the number of lines ({@see StacktraceFrameBuilder::MAX_FRAME_ARGUMENTS}). Serialized values are not
* length-truncated; only param names are capped ({@see StacktraceFrameBuilder::formatTruncatedArgumentLine});
* prebuilt strings are split on the first `" = "` with {@see StacktraceFrameBuilder::truncatePrebuiltArgumentLine}.
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description mentions capping/truncating each stack argument line by byte length, but buildArgumentsList() explicitly keeps serialized values unbounded and the tests preserve very long prebuilt argument lines. This can still produce oversized events. Either implement a per-line byte cap (for both prebuilt arguments and formatted args) or update the PR description/contract expectations to match the current behavior.

Copilot uses AI. Check for mistakes.
Comment on lines +292 to +297
public static function formatTruncatedArgumentLine(string $name, string $serializedValue): string
{
$namePart = self::truncateUtf8StringToMaxBytes($name, self::MAX_ARGUMENT_NAME_BYTES);

return $namePart . ' = ' . $serializedValue;
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

formatTruncatedArgumentLine() only truncates the argument name; the serialized value (and therefore the full "name = value" line) can still be arbitrarily large (e.g., very long strings/arrays), which can defeat the goal of keeping payloads small. Consider enforcing an explicit max byte length for the entire line and applying it consistently to both formatted and prebuilt argument lines.

Copilot uses AI. Check for mistakes.
Comment on lines +301 to +308
* Lines without `" = "` are returned as-is (no length limit).
*/
public static function truncatePrebuiltArgumentLine(string $line): string
{
$separator = ' = ';
$position = strpos($line, $separator);
if ($position === false) {
return $line;
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

truncatePrebuiltArgumentLine() returns lines without the " = " delimiter unchanged (no length cap). That allows a single prebuilt argument entry to be extremely large and bypass any payload-size protections. Consider applying the same max-byte truncation to the whole line even when the delimiter is missing.

Suggested change
* Lines without `" = "` are returned as-is (no length limit).
*/
public static function truncatePrebuiltArgumentLine(string $line): string
{
$separator = ' = ';
$position = strpos($line, $separator);
if ($position === false) {
return $line;
* Lines without `" = "` are truncated as whole lines to the configured byte cap.
*/
public static function truncatePrebuiltArgumentLine(string $line): string
{
$separator = ' = ';
$position = strpos($line, $separator);
if ($position === false) {
return self::truncateUtf8StringToMaxBytes($line, self::MAX_ARGUMENT_NAME_BYTES);

Copilot uses AI. Check for mistakes.
Comment on lines 279 to 283
try {
$newArguments[] = sprintf('%s = %s', $name, $value);
$newArguments[] = self::formatTruncatedArgumentLine((string) $name, $value);
} catch (\Exception $e) {
// Ignore unknown types
}
Copy link

Copilot AI Apr 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The try/catch around formatTruncatedArgumentLine() appears unnecessary: neither serializeValue() nor formatTruncatedArgumentLine() throws, so the catch block should be unreachable. Removing it would simplify the control flow (or, if an exception is expected, catch a specific exception from the actual throwing call).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants