Silent type coercion in SHAPE normalization masks malformed-spec type errors #263

Description

@amc-corey-cox

Malformed-type values in a transformation spec are silently coerced to strings during the SHAPE phase (ReferenceValidator.normalize()) instead of surfacing as validation errors. Pydantic alone rejects all of these; ReferenceValidator coerces them before the model ever sees them:

| Spec input | After SHAPE | Pydantic-alone result |
| --- | --- | --- |
| `sources: 5` | `["5"]` | `ValidationError` |
| `sources: {a: 1}` | `["1"]` | `ValidationError` |
| `object_derivations: "oops"` | `["oops"]` (then flattened to `[]`) | `ValidationError` |

ensure_list is doing two jobs at once: the desirable scalar→list convenience ("lr" → ["lr"], which the multivalued populated_from feature relies on) and an undesirable coercion of genuinely wrong types that hides the error.
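
To make the two jobs concrete, here is a rough sketch of a lenient helper that produces the same outputs as the examples in this issue. `lenient_ensure_list` is a hypothetical stand-in written for illustration, not the project's actual `ensure_list`, whose implementation may differ:

```python
from typing import Any


def lenient_ensure_list(value: Any) -> list[str]:
    """Hypothetical stand-in for a lenient normalizer (NOT the real ensure_list)."""
    if value is None:
        return []
    if isinstance(value, dict):
        # Assumption: a mapping is reduced to its values.
        items = list(value.values())
    elif isinstance(value, (list, tuple)):
        items = list(value)
    else:
        # Job 1 (desirable): wrap a lone scalar in a one-element list.
        items = [value]
    # Job 2 (undesirable): stringify whatever is there, hiding wrong types.
    return [str(item) for item in items]


print(lenient_ensure_list("lr"))      # convenience case
print(lenient_ensure_list(5))         # wrong type, silently coerced
print(lenient_ensure_list({"a": 1}))  # wrong type, silently coerced
```

Once the value has been rewritten this way, a downstream model only ever sees a well-formed list of strings, so it has nothing left to reject.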

This is a SHAPE/ReferenceValidator layer-wide behavior affecting every multivalued field, not specific to sources or object_derivations. Low priority — it only bites on actively malformed YAML, the common scalar/number-string case coerces to something sensible, and it predates the current normalization work.

Possible directions:

  • Add a pre-SHAPE type check that rejects values that are neither a scalar of the expected type nor a list of them, before ensure_list coerces them.
  • Or a strict mode that distinguishes "scalar → one-element list" convenience from "wrong type" and fails loudly on the latter.
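
A minimal sketch of how either direction might look, assuming a per-field expected type is known at normalization time. `strict_ensure_list`, `SpecTypeError`, and the `field`/`expected` parameters are hypothetical names introduced for illustration, not existing project API:

```python
from typing import Any


class SpecTypeError(TypeError):
    """Hypothetical error for a spec value of the wrong type."""


def strict_ensure_list(value: Any, field: str, expected: type = str) -> list:
    """Wrap a scalar of the expected type in a list; reject everything else."""
    if value is None:
        return []
    if isinstance(value, expected):
        # The convenience worth keeping: scalar -> one-element list.
        return [value]
    if isinstance(value, list):
        bad = [item for item in value if not isinstance(item, expected)]
        if bad:
            raise SpecTypeError(
                f"{field}: expected a list of {expected.__name__}, "
                f"got item(s) {bad!r}"
            )
        return list(value)
    # Anything else is a genuinely wrong type: fail loudly instead of coercing.
    raise SpecTypeError(
        f"{field}: expected {expected.__name__} or a list of them, "
        f"got {type(value).__name__}"
    )
```

Under these assumptions, `strict_ensure_list("lr", "sources")` still returns `["lr"]`, while `sources: 5`, `sources: {a: 1}`, and `object_derivations: "oops"` (checked with `expected=dict`) would each raise before any coercion happens. The same function could serve as the pre-SHAPE check or sit behind a strict-mode flag.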

Surfaced during review of #250 (Copilot flagged the _flatten_ods_in_class_deriv and _migrate_pv_sources_to_populated_from MIGRATE-phase helpers, but the coercion actually happens upstream in SHAPE).
