Skip to content

Conversation

@kosiew
Copy link
Contributor

@kosiew kosiew commented Nov 11, 2025

Which issue does this PR close?

Closes #18109.

Rationale for this change

Previously, the SQL planner accepted WITHIN GROUP clauses for all aggregate UDAFs, even those that did not explicitly support ordered-set semantics. This behavior was too permissive and inconsistent with PostgreSQL. For example, queries such as SUM(x) WITHIN GROUP (ORDER BY x) were allowed, even though SUM is not an ordered-set aggregate.

This PR enforces stricter validation so that only UDAFs that explicitly return true from supports_within_group_clause() may use WITHIN GROUP. All other aggregates now produce a clear planner error when this syntax is used.

What changes are included in this PR?

  • Added type alias WithinGroupExtraction to simplify complex tuple return types used by helper functions.

  • Introduced a new helper method extract_and_prepend_within_group_args to centralize logic for handling WITHIN GROUP argument rewriting.

  • Updated the planner to:

    • Validate that only UDAFs with supports_within_group_clause() can accept WITHIN GROUP.
    • Prepend WITHIN GROUP ordering expressions to function arguments only for supported ordered-set aggregates.
    • Produce clear error messages when WITHIN GROUP is used incorrectly.
  • Added comprehensive unit tests verifying correct behavior and failure cases:

    • WITHIN GROUP rejected for non-ordered-set aggregates (MIN, SUM, etc.).
    • WITHIN GROUP accepted for ordered-set aggregates such as percentile_cont.
    • Validation for named arguments, multiple ordering expressions, and semantic conflicts with OVER clauses.
  • Updated SQL logic tests (aggregate.slt) to reflect new rejection behavior.

  • Updated documentation:

    • aggregate_functions.md and developer docs to clarify when and how WITHIN GROUP can be used.
    • upgrading.md to inform users of this stricter enforcement and migration guidance.

Are these changes tested?

✅ Yes.

  • New tests in sql_integration.rs validate acceptance, rejection, and argument behavior of WITHIN GROUP for both valid and invalid cases.
  • SQL logic tests (aggregate.slt) include negative test cases confirming planner rejections.

Are there any user-facing changes?

✅ Yes.

  • Users attempting to use WITHIN GROUP with regular aggregates (e.g. SUM, AVG, MIN, MAX) will now see a planner error:

    WITHIN GROUP is only supported for ordered-set aggregate functions

  • Documentation has been updated to clearly describe WITHIN GROUP semantics and provide examples of valid and invalid usage.

No API-breaking changes were introduced; only stricter planner validation and improved error messaging.

Ensure SQL WITHIN GROUP(...) is only allowed for UDAFs
that opt into ordered-set semantics. Cached the support
check locally and updated the condition to return an error
when the clause is incorrectly used. Added a unit test to
verify this rejection for non-ordered-set aggregate functions.
…UDAFs and enhance upgrade documentation for stricter planner enforcement
Refactor within_group_rejected_for_non_ordered_set_udaf to
extract the planner error using .expect_err(...).to_string()
to avoid brittle Debug formatting issues.

Add within_group_allowed_for_ordered_set_udaf to test
percentile_cont_udaf() registration with the planner, ensuring
the planning of WITHIN GROUP invocation succeeds and
verifying that the resulting plan mentions percentile_cont.
Add a concise comment for explicit opt-in in the planner.
Extract argument-prepending logic into a helper method
to centralize the protocol.

Add tests to validate rejection of multiple ORDER BY items
inside WITHIN GROUP and the combination of USING WITHIN
GROUP with OVER.
@github-actions github-actions bot added documentation Improvements or additions to documentation sql SQL Planner development-process Related to development process of DataFusion sqllogictest SQL Logic Tests (.slt) labels Nov 11, 2025
Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask how much of this PR was LLM generated? I see a lot of hallmarks of LLM generated code.

Comment on lines 216 to 218
// Type alias to simplify very complex tuple return types used by helper
// functions in this module. Using a named alias reduces the type complexity
// shown in compiler errors and improves readability.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These comments feel overly verbose

Also having this type alias so far away from its only usage feels odd to me

Copy link
Contributor Author

@kosiew kosiew Nov 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the comment.
Moving the alias closer to fn extract_and_prepend_within_group_args generated this compiler error
error[E0658]: inherent associated types are unstable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a stable Rust release, this errors is resolved by moving the type alias outside of the impl block and making it a top-level type alias

Comment on lines 499 to 515
// Enforce explicit opt-in for WITHIN GROUP: UDAFs must advertise
// `supports_within_group_clause()` to accept WITHIN GROUP syntax.
// Previously the planner forwarded WITHIN GROUP to any
// order-sensitive aggregate; that was too permissive.
let supports_within_group = fm.supports_within_group_clause();

if !within_group.is_empty() && !supports_within_group {
return plan_err!(
"WITHIN GROUP is only supported for ordered-set aggregate functions"
);
}

// If the UDAF opted into WITHIN GROUP handling, convert the
// WITHIN GROUP ordering into sort expressions and prepend the
// ordering expressions to the function argument list (as unnamed
// args). This helper centralizes the argument-prepending protocol
// so it's easier to maintain and reason about.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto for these comments

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended to be more succinct.

Comment on lines 823 to 829
args: Vec<Expr>,
arg_names: Vec<Option<String>>,
schema: &DFSchema,
planner_context: &mut PlannerContext,
) -> Result<WithinGroupExtraction> {
let mut args = args;
let mut arg_names = arg_names;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
args: Vec<Expr>,
arg_names: Vec<Option<String>>,
schema: &DFSchema,
planner_context: &mut PlannerContext,
) -> Result<WithinGroupExtraction> {
let mut args = args;
let mut arg_names = arg_names;
mut args: Vec<Expr>,
mut arg_names: Vec<Option<String>>,
schema: &DFSchema,
planner_context: &mut PlannerContext,
) -> Result<WithinGroupExtraction> {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure adding these tests here are necessary; SLTs should be sufficient and I believe some of the tests added here are already in SLTs, e.g.

# Not supported over sliding windows
query error DataFusion error: Error during planning: OVER and WITHIN GROUP clause cannot be used together. OVER is for window functions, whereas WITHIN GROUP is for ordered set aggregate functions
SELECT approx_percentile_cont(0.5)
WITHIN GROUP (ORDER BY c3)
OVER (ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)
FROM aggregate_test_100

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the tests that are already covered in slt.

Comment on lines +51 to +71
## WITHIN GROUP / Ordered-set aggregates

Some aggregate functions support the SQL `WITHIN GROUP (ORDER BY ...)` clause. This clause is used by
ordered-set aggregate functions (for example, percentile and rank-like aggregations) to specify the ordering
of inputs that the aggregate relies on. In DataFusion, only aggregate functions that explicitly opt into
ordered-set semantics via their implementation will accept `WITHIN GROUP`; attempting to use `WITHIN GROUP`
with a regular aggregate (for example `SUM(x) WITHIN GROUP (ORDER BY x)`) will fail during planning with an
error. This matches Postgres semantics and ensures ordered-set behavior is opt-in for user-defined aggregates.

Example (ordered-set aggregate):

```sql
percentile_cont(0.5) WITHIN GROUP (ORDER BY value)
```

Example (invalid usage — planner will error):

```sql
-- This will fail: SUM is not an ordered-set aggregate
SELECT SUM(x) WITHIN GROUP (ORDER BY x) FROM t;
```
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we decide to add this to our documentation we should be explicit about which functions are ordered set aggregates. As it is, the user cannot tell by this alone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ordered-set aggregate functions (for example, percentile and rank-like aggregations) to specify the ordering
of inputs that the aggregate relies on. In DataFusion, only aggregate functions that explicitly opt into
ordered-set semantics via their implementation will accept WITHIN GROUP; attempting to use WITHIN GROUP

I thought this part gave a good indication of ordered set aggregate functions.

@kosiew
Copy link
Contributor Author

kosiew commented Nov 12, 2025

Can I ask how much of this PR was LLM generated? I see a lot of hallmarks of LLM generated code.

Thank you for your patient review and feedback.

image

I use VS Code with Copilot Pro, but I always review, adapt, and test everything before opening a PR.
The final code reflects my design and review decisions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

development-process Related to development process of DataFusion documentation Improvements or additions to documentation sql SQL Planner sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

WITHIN GROUP needs to be more strict

2 participants