Conversation

@SubhamSinghal

No description provided.

SubhamSinghal changed the title from "fix: string escape (#868)" to "feat: to_char expression (#868)" on Sep 14, 2025
shehabgamin added the "run spark tests" label (Trigger Spark tests on a pull request) on Sep 14, 2025
Contributor

shehabgamin left a comment

@SubhamSinghal Thank you so much for this contribution! Exciting progress

Ok(ColumnarValue::Scalar(ScalarValue::Utf8(Some(s))))
}
}
_ => not_impl_err!("to_char currently supports only scalar values with literal format string"),
Contributor

We also need to account for array input:

(ColumnarValue::Array(values), ColumnarValue::Array(formats))
(ColumnarValue::Array(values), ColumnarValue::Scalar(ScalarValue::Utf8(Some(fmt))))
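One way to picture the extra match arms is the sketch below. It is not the PR's code: `Columnar` is a hypothetical stand-in for DataFusion's `ColumnarValue` so that the example compiles with the standard library alone, and `format_one` is a placeholder formatter (plain zero-padding to the width of the format string) rather than the real `to_char` logic. The point is only the dispatch over scalar and array inputs, with nulls propagated row by row.

```rust
// Hypothetical stand-in for DataFusion's `ColumnarValue`; `None` models a null.
#[derive(Debug, Clone, PartialEq)]
enum Columnar<T> {
    Scalar(Option<T>),
    Array(Vec<Option<T>>),
}

// Placeholder formatter (an assumption, not the PR's logic): left-pads the
// value with zeros up to the length of the format string.
fn format_one(value: u64, fmt: &str) -> String {
    format!("{:0>width$}", value, width = fmt.len())
}

// Dispatches over every scalar/array combination of (value, format).
fn to_char(
    values: &Columnar<u64>,
    formats: &Columnar<String>,
) -> Result<Columnar<String>, String> {
    // A null value or a null format yields a null result.
    let apply = |v: &Option<u64>, f: &Option<String>| match (v, f) {
        (Some(v), Some(f)) => Some(format_one(*v, f)),
        _ => None,
    };
    match (values, formats) {
        (Columnar::Scalar(v), Columnar::Scalar(f)) => Ok(Columnar::Scalar(apply(v, f))),
        // Array of values with one literal format: reuse the format per row.
        (Columnar::Array(vs), Columnar::Scalar(f)) => {
            Ok(Columnar::Array(vs.iter().map(|v| apply(v, f)).collect()))
        }
        // Array of values with an array of formats: pair them up row by row.
        (Columnar::Array(vs), Columnar::Array(fs)) => {
            if vs.len() != fs.len() {
                return Err(format!(
                    "length mismatch: {} values vs {} formats",
                    vs.len(),
                    fs.len()
                ));
            }
            Ok(Columnar::Array(
                vs.iter().zip(fs).map(|(v, f)| apply(v, f)).collect(),
            ))
        }
        // One value with an array of formats: reuse the value per row.
        (Columnar::Scalar(v), Columnar::Array(fs)) => {
            Ok(Columnar::Array(fs.iter().map(|f| apply(v, f)).collect()))
        }
    }
}
```

In the real function the array arms would presumably downcast the Arrow arrays and build a string array instead of a `Vec`, but the shape of the `match` would be similar.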

Author

@shehabgamin I need some help fixing a Rust error.

github-actions bot commented Sep 14, 2025

Gold Data Report

Notes
  1. The tables below show the number of true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN) in gold data input processing.
  2. A positive input is a valid test case, while a negative input is a test case that is expected to fail.

Commit Information

Commit Revision Branch
After 60a2f13 refs/pull/874/merge
Before 92083e4 main

Summary

Commit TP TN FP FN Total
After 1898 194 44 379 2515
Before 1893 194 44 384 2515
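The Summary rows can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration: `GoldCounts` and `summary_rows` are hypothetical names, not part of the report tooling, and the numbers are copied from the table above. It confirms that each row sums to its Total and that the only change is five test cases moving from FN to TP.

```rust
/// Counts from one row of the gold data summary table (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq)]
struct GoldCounts {
    tp: i64,
    tn: i64,
    fp: i64,
    fn_: i64, // `fn` is a Rust keyword, hence the trailing underscore
}

impl GoldCounts {
    /// Total number of test cases in the row.
    fn total(&self) -> i64 {
        self.tp + self.tn + self.fp + self.fn_
    }

    /// Per-column change from `before` to `self`, as (TP, TN, FP, FN).
    fn delta(&self, before: &GoldCounts) -> (i64, i64, i64, i64) {
        (
            self.tp - before.tp,
            self.tn - before.tn,
            self.fp - before.fp,
            self.fn_ - before.fn_,
        )
    }
}

/// Returns the (After, Before) rows copied from the Summary table.
fn summary_rows() -> (GoldCounts, GoldCounts) {
    let after = GoldCounts { tp: 1898, tn: 194, fp: 44, fn_: 379 };
    let before = GoldCounts { tp: 1893, tn: 194, fp: 44, fn_: 384 };
    (after, before)
}
```

Calling `after.delta(&before)` on these rows gives `(5, 0, 0, -5)`, which matches the change in the `function/string.json` row of the detailed table (TP 156 to 161, FN 48 to 43).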

Details

Gold Data Metrics
Group File Commit TP TN FP FN Total
spark data_type.json After 43 5 0 0 48
Before 43 5 0 0 48
expression/case.json After 5 0 0 0 5
Before 5 0 0 0 5
expression/cast.json After 4 0 0 0 4
Before 4 0 0 0 4
expression/current.json After 2 0 0 0 2
Before 2 0 0 0 2
expression/date.json After 4 0 1 0 5
Before 4 0 1 0 5
expression/interval.json After 346 4 1 0 351
Before 346 4 1 0 351
expression/large.json After 2 0 0 0 2
Before 2 0 0 0 2
expression/like.json After 29 10 0 0 39
Before 29 10 0 0 39
expression/misc.json After 109 5 0 1 115
Before 109 5 0 1 115
expression/numeric.json After 31 6 1 0 38
Before 31 6 1 0 38
expression/string.json After 18 1 0 0 19
Before 18 1 0 0 19
expression/timestamp.json After 7 0 3 0 10
Before 7 0 3 0 10
expression/window.json After 73 0 1 0 74
Before 73 0 1 0 74
function/agg.json After 133 0 0 36 169
Before 133 0 0 36 169
function/array.json After 42 0 0 2 44
Before 42 0 0 2 44
function/bitwise.json After 15 0 0 0 15
Before 15 0 0 0 15
function/collection.json After 12 0 0 0 12
Before 12 0 0 0 12
function/conditional.json After 15 0 0 0 15
Before 15 0 0 0 15
function/conversion.json After 2 0 0 0 2
Before 2 0 0 0 2
function/csv.json After 2 0 0 3 5
Before 2 0 0 3 5
function/datetime.json After 121 0 0 24 145
Before 121 0 0 24 145
function/generator.json After 7 0 0 6 13
Before 7 0 0 6 13
function/hash.json After 5 0 0 2 7
Before 5 0 0 2 7
function/json.json After 7 0 0 13 20
Before 7 0 0 13 20
function/lambda.json After 0 0 0 31 31
Before 0 0 0 31 31
function/map.json After 11 0 0 0 11
Before 11 0 0 0 11
function/math.json After 122 0 0 2 124
Before 122 0 0 2 124
function/misc.json After 31 0 0 22 53
Before 31 0 0 22 53
function/predicate.json After 70 0 0 9 79
Before 70 0 0 9 79
function/string.json After 161 0 0 43 204
Before 156 0 0 48 204
function/struct.json After 2 0 0 0 2
Before 2 0 0 0 2
function/url.json After 9 0 0 1 10
Before 9 0 0 1 10
function/variant.json After 0 0 0 28 28
Before 0 0 0 28 28
function/window.json After 6 0 0 3 9
Before 6 0 0 3 9
function/xml.json After 0 0 0 17 17
Before 0 0 0 17 17
plan/ddl_alter_table.json After 49 14 3 11 77
Before 49 14 3 11 77
plan/ddl_alter_view.json After 5 1 0 0 6
Before 5 1 0 0 6
plan/ddl_analyze_table.json After 17 6 0 0 23
Before 17 6 0 0 23
plan/ddl_cache.json After 4 0 1 0 5
Before 4 0 1 0 5
plan/ddl_create_index.json After 0 0 0 3 3
Before 0 0 0 3 3
plan/ddl_create_table.json After 27 30 8 40 105
Before 27 30 8 40 105
plan/ddl_delete_from.json After 2 1 0 0 3
Before 2 1 0 0 3
plan/ddl_describe.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/ddl_drop_index.json After 0 0 0 2 2
Before 0 0 0 2 2
plan/ddl_drop_view.json After 5 0 0 0 5
Before 5 0 0 0 5
plan/ddl_insert_into.json After 16 1 1 0 18
Before 16 1 1 0 18
plan/ddl_insert_overwrite.json After 9 0 2 0 11
Before 9 0 2 0 11
plan/ddl_load_data.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/ddl_merge_into.json After 8 4 3 0 15
Before 8 4 3 0 15
plan/ddl_misc.json After 9 0 0 1 10
Before 9 0 0 1 10
plan/ddl_replace_table.json After 23 14 7 40 84
Before 23 14 7 40 84
plan/ddl_select.json After 1 0 0 0 1
Before 1 0 0 0 1
plan/ddl_show_views.json After 7 0 0 0 7
Before 7 0 0 0 7
plan/ddl_uncache.json After 2 0 0 0 2
Before 2 0 0 0 2
plan/ddl_update.json After 2 1 0 0 3
Before 2 1 0 0 3
plan/error_alter_table.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/error_analyze_table.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_create_table.json After 0 6 0 0 6
Before 0 6 0 0 6
plan/error_describe.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_join.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/error_load_data.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/error_misc.json After 0 14 0 0 14
Before 0 14 0 0 14
plan/error_order_by.json After 1 4 0 0 5
Before 1 4 0 0 5
plan/error_select.json After 0 15 0 0 15
Before 0 15 0 0 15
plan/error_with.json After 0 1 0 0 1
Before 0 1 0 0 1
plan/plan_alter_view.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/plan_create_view.json After 0 2 0 0 2
Before 0 2 0 0 2
plan/plan_explain.json After 0 1 1 0 2
Before 0 1 1 0 2
plan/plan_group_by.json After 9 1 0 1 11
Before 9 1 0 1 11
plan/plan_hint.json After 25 0 3 0 28
Before 25 0 3 0 28
plan/plan_insert_into.json After 3 0 0 0 3
Before 3 0 0 0 3
plan/plan_insert_overwrite.json After 2 0 0 0 2
Before 2 0 0 0 2
plan/plan_join.json After 53 2 1 6 62
Before 53 2 1 6 62
plan/plan_misc.json After 15 4 0 10 29
Before 15 4 0 10 29
plan/plan_order_by.json After 15 5 1 10 31
Before 15 5 1 10 31
plan/plan_select.json After 86 15 5 12 118
Before 86 15 5 12 118
plan/plan_set_operation.json After 17 0 0 0 17
Before 17 0 0 0 17
plan/plan_with.json After 6 0 1 0 7
Before 6 0 1 0 7
plan/unpivot_join.json After 4 0 0 0 4
Before 4 0 0 0 4
plan/unpivot_select.json After 14 6 0 0 20
Before 14 6 0 0 20
table_schema.json After 8 6 0 0 14
Before 8 6 0 0 14

linhr changed the title from "feat: to_char expression (#868)" to "feat: to_char expression" on Sep 15, 2025
linhr marked this pull request as draft on September 15, 2025 05:59
Contributor

linhr commented Sep 15, 2025

Thanks for the contribution!

You can run the following commands locally for formatting and linting:

  • cargo +nightly fmt
  • cargo clippy --all-targets --all-features

linhr requested a review from Copilot on September 15, 2025 12:13

Copilot AI left a comment

Pull Request Overview

This PR implements the Spark SQL to_char expression, which converts numeric values to strings according to a format pattern.

  • Implementation of the to_char function with support for various numeric format patterns including zero-padding, optional digits, decimal points, grouping separators, and sign indicators
  • Addition of comprehensive test coverage through golden test files demonstrating various formatting scenarios
  • Integration of the new function into the execution engine with proper registration and type handling
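To make the pattern categories in the first bullet concrete, here is a minimal tokenizer sketch. It is not the code in `format_tokens.rs`: the `FormatToken` names are hypothetical, and the character meanings (`0`, `9`, `.`/`D`, `,`/`G`, `$`, `S`, `MI`, `PR`) and the case-insensitive matching are assumptions based on Spark's number format conventions. It only splits a format string into tokens and does not check that the tokens appear in a valid order.

```rust
// Hypothetical token set for a Spark-style number format string.
#[derive(Debug, Clone, PartialEq)]
enum FormatToken {
    ZeroDigit,      // '0': digit position padded with zeros
    OptionalDigit,  // '9': digit position without zero padding
    DecimalPoint,   // '.' or 'D'
    GroupSeparator, // ',' or 'G'
    Dollar,         // '$'
    Sign,           // 'S'
    TrailingMinus,  // "MI"
    AngleBrackets,  // "PR"
}

// Splits `fmt` into tokens, or reports the first character it cannot match.
fn tokenize(fmt: &str) -> Result<Vec<FormatToken>, String> {
    let chars: Vec<char> = fmt.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        // Assumption: letters in the format are matched case-insensitively.
        let c = chars[i].to_ascii_uppercase();
        let next = chars.get(i + 1).map(|n| n.to_ascii_uppercase());
        // Two-character tokens are tried before single-character ones.
        let (token, width) = match (c, next) {
            ('M', Some('I')) => (FormatToken::TrailingMinus, 2),
            ('P', Some('R')) => (FormatToken::AngleBrackets, 2),
            ('0', _) => (FormatToken::ZeroDigit, 1),
            ('9', _) => (FormatToken::OptionalDigit, 1),
            ('.', _) | ('D', _) => (FormatToken::DecimalPoint, 1),
            (',', _) | ('G', _) => (FormatToken::GroupSeparator, 1),
            ('$', _) => (FormatToken::Dollar, 1),
            ('S', _) => (FormatToken::Sign, 1),
            (other, _) => {
                return Err(format!("unexpected character '{other}' at position {i}"))
            }
        };
        tokens.push(token);
        i += width;
    }
    Ok(tokens)
}
```

For example, `tokenize("99MI")` yields two `OptionalDigit` tokens followed by `TrailingMinus`, while a format containing an unrecognized character such as `"9x"` returns an error.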

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

File Description
python/pysail/tests/spark/function/test_to_char.txt Test examples showing to_char function behavior with various format patterns
crates/sail-spark-connect/tests/gold_data/function/string.json Updates golden test expectations from failure to success for to_char test cases
crates/sail-plan/src/extension/function/string/spark_to_char.rs Core implementation of the SparkToChar function with number formatting logic
crates/sail-plan/src/extension/function/string/mod.rs Module registration for the new to_char function and format tokens
crates/sail-plan/src/extension/function/string/format_tokens.rs Token parsing logic for format string patterns
crates/sail-execution/src/codec.rs Function registration and integration into the execution engine

Contributor

linhr commented Sep 18, 2025

Nice progress!

Could you merge from main and resolve the conflict in codec.rs? Then we can run the workflow and see how it goes.
