Standard on-chain encoding scheme for Cadence values #2165

@psiemens

Background

With the introduction of the Crypto library in Cadence, it is now possible to construct and verify hashes directly in a Cadence program. This is great and opens up many new use cases.

However, Cadence developers need to manually construct the [UInt8] byte array that is passed to the hash() function. For example, a developer could hash an Address and a UInt64 value like this:

let foo: Address = 0xee82856bf20e2aa6
let bar: UInt64 = 42

// Construct a message by combining the byte representation of both values
let data = foo.toBytes().concat(bar.toBigEndianBytes())

let hash = HashAlgorithm.SHA3_256.hash(data)

This example is fairly simple, but things get more difficult when working with more complex data structures.
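
For an off-chain client to check such a hash, it has to rebuild exactly the same bytes. A minimal Python sketch of that, assuming `Address.toBytes()` yields the 8-byte big-endian address and that `HashAlgorithm.SHA3_256` is standard SHA3-256; the helper names are hypothetical:

```python
import hashlib


def encode_address(address: int) -> bytes:
    # Assumption: a Flow address serializes as 8 big-endian bytes.
    return address.to_bytes(8, "big")


def encode_uint64(value: int) -> bytes:
    # Mirrors UInt64.toBigEndianBytes(): fixed width, big-endian.
    return value.to_bytes(8, "big")


foo = 0xEE82856BF20E2AA6
bar = 42

# Same order as the Cadence snippet: address bytes, then the UInt64.
data = encode_address(foo) + encode_uint64(bar)
digest = hashlib.sha3_256(data).digest()

print(data.hex())  # ee82856bf20e2aa6000000000000002a
```

Both values are fixed-width here, so plain concatenation is unambiguous. That stops being true as soon as a variable-length value is involved.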


I've recently been working on a commit-reveal scheme for NFTs as part of Freshmint:

  • Each NFT is hashed off-chain in JavaScript. The hash is computed over all its metadata fields (e.g. name, image, etc).
  • The hashes are minted as NFTs and stored on chain.
  • At a later date, the full metadata for each NFT is published to the chain. The contract rejects the published metadata if it does not hash to the same value published at mint time.
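
The commit and reveal steps above can be sketched off-chain as follows. This is only an illustration of the general pattern, not the Freshmint implementation: `commit` and `reveal_matches` are hypothetical names, the metadata is treated as already-encoded bytes, and the salt is assumed to have a fixed length.

```python
import hashlib
import hmac


def commit(salt: bytes, encoded_metadata: bytes) -> bytes:
    # The commitment published at mint time: a hash over salt + metadata bytes.
    return hashlib.sha3_256(salt + encoded_metadata).digest()


def reveal_matches(commitment: bytes, salt: bytes, encoded_metadata: bytes) -> bool:
    # At reveal time, recompute the hash and compare it to the stored commitment.
    return hmac.compare_digest(commitment, commit(salt, encoded_metadata))


# Placeholder salt for the example; a real salt would be random and secret until reveal.
salt = bytes(16)
commitment = commit(salt, b"example metadata bytes")

print(reveal_matches(commitment, salt, b"example metadata bytes"))   # True
print(reveal_matches(commitment, salt, b"tampered metadata bytes"))  # False
```

The check is only as strong as the encoding that produces `encoded_metadata`: if two different metadata objects can encode to the same bytes, both would pass the reveal.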

To implement this, I had to create an encoding scheme that could convert a set of Cadence values to a deterministic and secure byte representation.

The encoding needed to be injective in order to avoid ambiguous cases. See this snippet from the Cadence implementation:

/// FreshmintEncoding is a set of utilities for encoding Cadence values as byte arrays.
///
/// Variable-length values (String, Int, UInt) include a fixed-size length prefix to
/// prevent distinct sets of values from encoding to the same byte sequence when concatenated.
///
/// For example, for a structure with two fields:
///
///  - foo: String
///  - bar: String
///
/// Without a length prefix, these instances would produce the same encoding:
///
///  - Instance A: { foo: "foo", bar: "bar" }
///  - Instance B: { foo: "foob", bar: "ar" }
///
/// when using an encoding function like this:
/// 
/// let encoding = foo.utf8.concat(bar.utf8)
///
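
The collision described in that comment, and the effect of a length prefix, can be reproduced in a few lines of Python. The 4-byte big-endian prefix is an assumption made for this sketch; the comment only says the prefix is fixed-size, and the function names are hypothetical.

```python
def encode_naive(foo: str, bar: str) -> bytes:
    # Plain concatenation: field boundaries are lost.
    return foo.encode("utf-8") + bar.encode("utf-8")


def encode_string(value: str) -> bytes:
    # Assumed format: 4-byte big-endian byte length, then the UTF-8 bytes.
    raw = value.encode("utf-8")
    return len(raw).to_bytes(4, "big") + raw


def encode_prefixed(foo: str, bar: str) -> bytes:
    return encode_string(foo) + encode_string(bar)


instance_a = ("foo", "bar")
instance_b = ("foob", "ar")

print(encode_naive(*instance_a) == encode_naive(*instance_b))        # True: collision
print(encode_prefixed(*instance_a) == encode_prefixed(*instance_b))  # False
```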

Here's an example of how the FreshmintEncoding is used in a contract:

/// Encode this metadata object as a byte array.
///
/// This can be used to hash the metadata and verify its integrity.
///
pub fun encode(): [UInt8] {
    return self.salt
        .concat(FreshmintEncoding.encodeString(self.image))
        .concat(FreshmintEncoding.encodeUInt64(self.serialNumber))
        .concat(FreshmintEncoding.encodeString(self.name))
        .concat(FreshmintEncoding.encodeString(self.description))
        .concat(FreshmintEncoding.encodeString(self.shape))
        .concat(FreshmintEncoding.encodeString(self.color))
        .concat(FreshmintEncoding.encodeString(self.smile))
        .concat(FreshmintEncoding.encodeString(self.emboss))
        .concat(FreshmintEncoding.encodeString(self.outline))
        .concat(FreshmintEncoding.encodeString(self.birthmark))
        .concat(FreshmintEncoding.encodeString(self.redeemed))
}

pub fun hash(): [UInt8] {
    return HashAlgorithm.SHA3_256.hash(self.encode())
}

Limitations of my approach

  • Users still need to manually concatenate the result (see example above) because FreshmintEncoding only supports primitive values.
    • I didn't add functions to encode structs, resources, arrays or dictionaries. I think this would require a lot of reflection.
  • For dictionaries specifically, while it is possible to iterate and encode all values, it was difficult to reimplement Cadence's dictionary iteration ordering in JavaScript.
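
One way to sidestep the iteration-order problem, not something FreshmintEncoding does, is to make the encoding independent of iteration order by sorting entries by their encoded key on both sides. A rough Python sketch, assuming string keys, UInt64 values, and the same hypothetical 4-byte length prefix:

```python
from typing import Dict


def encode_string(value: str) -> bytes:
    # Assumed format: 4-byte big-endian byte length, then the UTF-8 bytes.
    raw = value.encode("utf-8")
    return len(raw).to_bytes(4, "big") + raw


def encode_uint64(value: int) -> bytes:
    return value.to_bytes(8, "big")


def encode_dict(entries: Dict[str, int]) -> bytes:
    # Encode each entry, then sort by the encoded key bytes so the result
    # does not depend on the order in which the dictionary is iterated.
    encoded = sorted((encode_string(k), encode_uint64(v)) for k, v in entries.items())
    out = len(encoded).to_bytes(4, "big")  # entry count prefix
    for key_bytes, value_bytes in encoded:
        out += key_bytes + value_bytes
    return out


print(encode_dict({"b": 2, "a": 1}) == encode_dict({"a": 1, "b": 2}))  # True
```

The same sort would have to be applied on-chain, which costs computation; a built-in encoder could define the canonical order once for every client.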

Feature request

It would be great if Cadence had a built-in encoding function to convert any value to a byte representation with the following properties:

  • Deterministic output (this is a given, but worth mentioning).
  • Injective for structs, resources, arrays and dictionaries.
  • Easy to port to client libraries (e.g. Go, JavaScript, Python, etc).
  • Reversibility is optional: the encoding does not need to be decodable (the way JSON is), but it could be.
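
To illustrate what injectivity means for nested values, here is a small recursive encoder in Python. It is a sketch under stated assumptions, not a proposed format: the one-byte type tags, the 4-byte length and count prefixes, and the restriction to strings, unsigned 64-bit integers, and lists are all choices made for this example.

```python
def encode(value: object) -> bytes:
    # bool is a subclass of int in Python; reject it to keep the sketch unambiguous.
    if isinstance(value, bool):
        raise TypeError("bool is not supported in this sketch")
    if isinstance(value, str):
        raw = value.encode("utf-8")
        # tag 0x01, then byte length, then the UTF-8 bytes
        return b"\x01" + len(raw).to_bytes(4, "big") + raw
    if isinstance(value, int):
        # tag 0x02, then a fixed-width unsigned 64-bit big-endian integer
        return b"\x02" + value.to_bytes(8, "big")
    if isinstance(value, list):
        # tag 0x03, then element count, then each element encoded recursively
        body = b"".join(encode(item) for item in value)
        return b"\x03" + len(value).to_bytes(4, "big") + body
    raise TypeError(f"unsupported type: {type(value).__name__}")


# Values that differ only in nesting or element boundaries get different encodings.
print(encode([["a"], ["b"]]) == encode([["a", "b"]]))  # False
print(encode(["ab"]) == encode(["a", "b"]))            # False
```

Every element carries its own tag and length, so a decoder could always tell where one value ends and the next begins; that is what rules out the ambiguous cases.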

Suggested solutions

  • Expand the existing RLP contract to support encoding.
  • Expose JSON-Cadence to the runtime.

Prior art

Solidity provides the abi.encodePacked function, which is often used to construct hashes in Ethereum contracts.
