feature: toindefarray struct tag for Cardano Plutus Data wire-format compatibility #762

@tdnguyenND

Context

This issue follows up on #756 (closed), per @fxamacker's suggestion to open an issue with specification context.

Apologies for two things upfront:

  1. Submitting the PR before opening an issue — I did not follow CONTRIBUTING correctly.
  2. The delayed reply — I wanted to gather the specification context you asked for properly rather than respond without it.

Motivation: externally-mandated wire format

The feature is not about choosing indefinite-length for its own sake. It is about matching an externally-mandated wire format that Go clients cannot opt out of: Cardano on-chain Plutus Data.

Cardano's cardano-node (via Haskell's cborg library) serializes PlutusData lists and Constr as indefinite-length CBOR arrays (0x9f … 0xff), with one exception (empty list → 0x80). This is documented in:

Why it matters — failure mode

Cardano computes the script integrity hash over the exact bytes of the serialized data (the ledger memoizes original binary; it does not canonicalize). If a Go client produces definite-length CBOR for a PlutusList, the hash differs from what cardano-node expects → transaction submission fails with NonOutputSupplimentaryDatums / mismatched datum hash. Plutus script validation then rejects the transaction.
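To make the failure mode concrete, here is a minimal, standard-library-only sketch showing that two well-formed encodings of the same logical list produce different digests when the raw bytes are hashed. SHA-256 is used only as a stand-in because Blake2b-256 (which Cardano uses) is not in the Go standard library; the `digest` helper and the sample list `[1, 2, 3]` are illustrative assumptions, not ledger code.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Two well-formed CBOR encodings of the same logical list [1, 2, 3].
var (
	definiteList   = []byte{0x83, 0x01, 0x02, 0x03}       // array(3) header, then 1, 2, 3
	indefiniteList = []byte{0x9f, 0x01, 0x02, 0x03, 0xff} // array(*) header, 1, 2, 3, break
)

// digest hashes the raw bytes exactly as given; nothing is canonicalized first.
// SHA-256 is a stand-in here and is NOT the hash function Cardano uses.
func digest(raw []byte) string {
	sum := sha256.Sum256(raw)
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println("definite:  ", digest(definiteList))
	fmt.Println("indefinite:", digest(indefiniteList))
	fmt.Println("equal:     ", digest(definiteList) == digest(indefiniteList))
}
```

Any hash taken over the serialized bytes behaves the same way: the decoded values are identical, but the inputs to the hash are not.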

This is not a preference; it is a wire-format requirement imposed by an existing production blockchain.

CDDL reference

From the Plutus Data CDDL in cardano-ledger/conway:

plutus_data =
    constr<plutus_data>
  / { * plutus_data => plutus_data }
  / [ * plutus_data ]
  / big_int
  / bounded_bytes

constr<a> =
    #6.121([* a]) / #6.122([* a]) / ... / #6.127([* a])
  / #6.1280([* a]) / ... / #6.1400([* a])
  / #6.102([uint, [* a]])

CDDL itself does not prescribe definite vs indefinite encoding (per RFC 8610). However, the operational wire format chosen by cardano-node is indefinite-length for non-empty arrays, and clients must match it bit-for-bit.

Ecosystem evidence — Go users already hand-rolling this

Several Go Cardano libraries already implement this workaround manually because fxamacker/cbor lacks the declarative option:

Each of these libraries has to write MarshalCBOR for every Plutus struct type, which duplicates the role that toarray already plays declaratively for definite-length.

On the "use cbor.Marshaler workaround" suggestion

The workaround is functional and I appreciate the pointer to StartIndefiniteArray/EndIndefinite. My concern is ergonomic scaling: a realistic Cardano dApp has tens or hundreds of datum/redeemer structs. Each would need an 8–10 line hand-written MarshalCBOR that must stay in sync with the struct fields (easy to drift when adding fields; no compiler help). plutigo's ~60 LOC helper + per-type MarshalCBOR is the public proof of this cost.

The proposed toindefarray tag is symmetric with the existing toarray — same code paths, same cache, same semantics, just the wire-form header byte differs.

Addressing security considerations

You raised valid concerns about indefinite-length and security. A few points on how the proposed feature limits the blast radius:

  1. Encoding-only, no decoder change. PR #756 (feat: add toindefarray struct tag option for indefinite-length CBOR arrays) only adds an encoding path. The decoder already accepts indefinite-length arrays for toarray structs via getHeadWithIndefiniteLengthFlag(), so no new decode surface area is introduced.
  2. Opt-in per-struct via tag. Users who do not add toindefarray see zero behavior change.
  3. Honors existing IndefLengthForbidden mode. The implementation can check em.indefLength == IndefLengthForbidden and return the existing IndefiniteLengthError, consistent with how Encoder.StartIndefiniteArray() already behaves (stream.go:247–250). This preserves the defense that profiles like CTAP2 Canonical and Core Deterministic Encoding rely on (per RFC 8949 §4.2, indefinite-length items are not allowed in deterministic encoding — that invariant is retained).
  4. No new indefinite-length behavior on the wire. The library already produces indefinite-length arrays when the user explicitly calls StartIndefiniteArray(). The feature just exposes that capability declaratively.
  5. No impact on users not using the option — aligning with the CONTRIBUTING guidance that PRs "should not reduce speed, increase memory use, reduce security, etc. for people not using the new option."

Proposed scope

Minimal change, matching what was in #756:

  • Add toindefarray as a struct tag option (parallel to toarray).
  • Encoding emits 0x9f … 0xff instead of a definite-length header such as 0x82 … / 0x83 ….
  • Decoding unchanged (already handles both forms).
  • Respects IndefLengthForbidden.

Open questions for your guidance:

  • Would you prefer the option to always check IndefLengthForbidden, or to gate it additionally behind a new EncOptions flag?
  • Should the CTAP2 preset explicitly forbid toindefarray? It already sets IndefLengthForbidden, so this would be automatic with the check above.
  • Any additional test coverage you would like to see (round-trip with nested structs, empty-array edge case matching Cardano's rule, rejection when IndefLengthForbidden)?

Next steps

If this issue is welcomed, I will open a fresh PR addressing:

  • Full PR checklist form with DCO sign-off (commits signed with Signed-off-by:).
  • The IndefLengthForbidden check.
  • Additional tests per your guidance.
  • Link to this issue.

Thank you for your time reviewing this request and for maintaining fxamacker/cbor — it's the de-facto CBOR library in Go and Cardano Go tooling depends on it.
