Summary
When converting a Substrait plan from JSON to text format, the extensionUriReference field on extension functions (and likely types) is ignored. The formatter always displays URI anchor 0 regardless of the actual reference value, then emits an error about the missing anchor.
Reproduction
Save this as repro.json:
{
  "extensionUris": [
    { "extensionUriAnchor": 1, "uri": "/functions_comparison.yaml" }
  ],
  "extensions": [
    {
      "extensionFunction": {
        "extensionUriReference": 1,
        "functionAnchor": 1,
        "name": "is_not_null:any"
      }
    }
  ],
  "relations": [
    {
      "root": {
        "input": {
          "filter": {
            "common": { "direct": {} },
            "input": {
              "read": {
                "common": { "direct": {} },
                "baseSchema": {
                  "names": ["x"],
                  "struct": {
                    "types": [{ "i64": { "nullability": "NULLABILITY_NULLABLE" } }],
                    "nullability": "NULLABILITY_REQUIRED"
                  }
                },
                "namedTable": { "names": ["my_table"] }
              }
            },
            "condition": {
              "scalarFunction": {
                "functionReference": 1,
                "outputType": { "bool": { "nullability": "NULLABILITY_REQUIRED" } },
                "arguments": [{ "value": { "selection": { "directReference": { "structField": { "field": 0 } }, "rootReference": {} } } }]
              }
            }
          }
        },
        "names": ["x"]
      }
    }
  ],
  "version": { "majorNumber": 0, "minorNumber": 48, "patchNumber": 0 }
}
Then run:
substrait-explain convert -i repro.json -t text
Expected output
=== Extensions
Functions:
  # 1 @ 1: is_not_null:any
=== Plan
Root[x]
  Filter[is_not_null($0) => $0]
    Read[my_table => x:i64?]
Actual output
=== Extensions
Functions:
  # 1 @ 0: is_not_null:any
=== Plan
Root[x]
  Filter[is_not_null($0) => $0]
    Read[my_table => x:i64?]
Formatting issues:
Error adding simple extension: Missing URN anchor 0 for extension Function anchor 1 name is_not_null:any
The function is displayed as # 1 @ 0 when it should be # 1 @ 1.
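For anyone turning this into a regression test, the mismatch can be checked mechanically. The sketch below pulls the function anchor, URI anchor, and name out of declaration lines shaped like the ones in the two outputs above. The regular expression and the helper name parse_function_declarations are assumptions made for illustration, based only on the line format shown here; they are not part of substrait-explain.

```python
import re

# Matches a declaration line such as "# 1 @ 0: is_not_null:any".
# The pattern is inferred from the output in this report, not from the tool's grammar.
DECL = re.compile(r"^\s*#\s*(\d+)\s*@\s*(\d+):\s*(.+?)\s*$")


def parse_function_declarations(text):
    """Return (function_anchor, uri_anchor, name) for each matching line."""
    found = []
    for line in text.splitlines():
        match = DECL.match(line)
        if match:
            found.append((int(match.group(1)), int(match.group(2)), match.group(3)))
    return found


expected = parse_function_declarations("Functions:\n# 1 @ 1: is_not_null:any")
actual = parse_function_declarations("Functions:\n# 1 @ 0: is_not_null:any")

# The two differ only in the URI anchor (second element of the tuple).
print("expected:", expected)
print("actual:  ", actual)
```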
Notes
- JSON→JSON round-trip preserves extensionUriReference: 1 correctly, so protobuf deserialization is fine. The bug is in the text formatting path.
- This also likely affects extensionType and extensionTypeVariation declarations, which have the same extensionUriReference field.
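To rule out a malformed input, the plan JSON can be checked independently of the formatter. The following is a minimal sketch, assuming only the field names that appear in repro.json above; the helper name dangling_uri_references is hypothetical, and a missing extensionUriReference is treated as 0, which is the usual proto3 JSON default. The inline plan is a trimmed copy of the extension declarations from repro.json.

```python
import json


def dangling_uri_references(plan):
    """List (kind, name, reference) for declarations whose URI reference is undeclared."""
    declared = {uri.get("extensionUriAnchor", 0) for uri in plan.get("extensionUris", [])}
    problems = []
    for extension in plan.get("extensions", []):
        # Each entry holds one of extensionFunction, extensionType, extensionTypeVariation.
        for kind, declaration in extension.items():
            reference = declaration.get("extensionUriReference", 0)
            if reference not in declared:
                problems.append((kind, declaration.get("name"), reference))
    return problems


# Trimmed copy of the extension section of repro.json.
plan = json.loads("""
{
  "extensionUris": [
    { "extensionUriAnchor": 1, "uri": "/functions_comparison.yaml" }
  ],
  "extensions": [
    {
      "extensionFunction": {
        "extensionUriReference": 1,
        "functionAnchor": 1,
        "name": "is_not_null:any"
      }
    }
  ]
}
""")

# An empty list means every declaration points at a declared URI anchor.
print(dangling_uri_references(plan))
```

An empty result for repro.json is consistent with the first note: the reference in the input resolves, so the anchor 0 in the text output is introduced later.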