@ariesdevil ariesdevil commented Dec 25, 2025

Why?

This PR fixes a schema evolution issue with tuple structs.

Previously, tuple struct fields were sorted by type (same as named structs), which caused schema evolution to break when adding fields of different types.

For example, evolving struct Point(f64, u8) to struct Point(f64, u8, f64) would cause fields to be incorrectly matched during deserialization because the new f64 field would be sorted before u8.
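The mismatch can be reproduced with a small sketch. This is not the library's sorting code: the `Ty` enum, the `size` key (larger primitives first), and `sorted_by_type` are hypothetical names, assumed only to mirror the behavior described above, where the new f64 field ends up ahead of u8.

```rust
// Hypothetical illustration of type-based sorting; not the library's actual code.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Ty {
    F64,
    U8,
}

impl Ty {
    // Assumed sort key: wider primitives are laid out first.
    fn size(self) -> usize {
        match self {
            Ty::F64 => 8,
            Ty::U8 => 1,
        }
    }
}

/// Returns the original field indices in the order they would be serialized
/// if fields were sorted by type.
fn sorted_by_type(fields: &[Ty]) -> Vec<usize> {
    let mut idx: Vec<usize> = (0..fields.len()).collect();
    // `sort_by_key` is stable, so fields of the same type keep definition order.
    idx.sort_by_key(|&i| std::cmp::Reverse(fields[i].size()));
    idx
}
```

Under these assumptions, `Point(f64, u8)` serializes its fields in index order `[0, 1]`, while `Point(f64, u8, f64)` serializes them as `[0, 2, 1]`. The second serialized slot holds a u8 in the old layout and an f64 in the new one, so a reader that pairs fields by serialized position reads the wrong value.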

What does this PR do?

  1. Introduce SortedField struct: A helper struct that preserves the original field index alongside the field reference. This allows us to correctly track field positions regardless of serialization order.

  2. Preserve tuple struct field order: For tuple structs, fields are no longer sorted by type. Instead, they maintain their original definition order ("0", "1", "2", ...). This ensures that field names consistently map to their positions, enabling proper schema evolution.

  3. Unify protocol for tuple and named structs: Both tuple structs and named structs now use the same underlying protocol (field name based matching), but with different field name strategies:

    • Named structs: use field identifiers as names (sorted by type for optimal layout)
    • Tuple structs: use positional indices as names (unsorted to preserve schema evolution)

  4. Add schema evolution tests: Comprehensive tests for tuple struct schema evolution, including:

    • Adding fields at the end
    • Removing fields from the end
    • Adding fields with different types (i64, u8, f64)
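As a rough sketch of the points above, the following shows positional names combined with name-based matching. The PR text names `SortedField` but does not show its definition, so the fields here (`original_index`, `ty`) and the helpers `tuple_fields`, `write_tuple`, and `read_by_name` are assumptions for illustration only. Values are simplified to `f64`, and missing fields fall back to the type's default.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the PR's SortedField: a field reference
// (here just a type name) paired with its original definition index.
#[derive(Debug, Clone, PartialEq)]
struct SortedField<'a> {
    original_index: usize,
    ty: &'a str,
}

/// Tuple structs: keep definition order and use the position as the field name.
fn tuple_fields<'a>(types: &[&'a str]) -> Vec<(String, SortedField<'a>)> {
    types
        .iter()
        .enumerate()
        .map(|(i, &ty)| (i.to_string(), SortedField { original_index: i, ty }))
        .collect()
}

/// Writer side: emit (name, value) pairs, named "0", "1", "2", ...
fn write_tuple(values: &[f64]) -> Vec<(String, f64)> {
    values
        .iter()
        .enumerate()
        .map(|(i, v)| (i.to_string(), *v))
        .collect()
}

/// Reader side: look up each of the reader's own fields by name.
/// Fields the writer did not send get a default; extra written fields are ignored.
fn read_by_name(written: &[(String, f64)], reader_field_count: usize) -> Vec<f64> {
    let by_name: HashMap<&str, f64> =
        written.iter().map(|(n, v)| (n.as_str(), *v)).collect();
    (0..reader_field_count)
        .map(|i| by_name.get(i.to_string().as_str()).copied().unwrap_or_default())
        .collect()
}
```

With this scheme, data written by a two-field version and read by a three-field version yields the two original values plus a default for field "2". Data written by the three-field version and read by the two-field version drops the trailing field. Both cases correspond to the "adding fields at the end" and "removing fields from the end" tests listed above.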

Related issues

Does this PR introduce any user-facing change?
[ ] Does this PR introduce any public API change?
[x] Does this PR introduce any binary protocol compatibility change?
Note: Yes, but since tuple struct support was only just added, I think no one (except me) is using this feature yet :)

Benchmark

@ariesdevil ariesdevil changed the title refactor(rust): unify tuple struct and named struct protocol, and mak… refactor(rust): unify tuple struct and named struct protocol, and make schema evolution happy Dec 25, 2025
@ariesdevil ariesdevil force-pushed the feat/add-tuple-struct-support branch 2 times, most recently from f1e8c08 to 6a0b4aa Compare December 26, 2025 01:50
@ariesdevil ariesdevil force-pushed the feat/add-tuple-struct-support branch from 6a0b4aa to 1a47cc3 Compare December 26, 2025 01:56
@ariesdevil

@chaokunyang Fixed, PTAL again.

@chaokunyang chaokunyang left a comment

LGTM

@chaokunyang chaokunyang merged commit b4f0908 into apache:main Dec 26, 2025
52 checks passed
@ariesdevil ariesdevil deleted the feat/add-tuple-struct-support branch December 26, 2025 02:41