[GH-3013] Box3D aggregate: ST_3DExtent #3015

Merged
jiayuasu merged 1 commit into apache:master from jiayuasu:feature/box3d-extent on May 30, 2026

@jiayuasu
Member

Did you read the Contributor Guide?

Is this PR related to a ticket?

  • Yes — closes #3013; completes the Box3D Phase 1 epic (#2973).

What changes were proposed in this PR?

Final slice of the Box3D Phase 1 epic. Mirrors PostGIS's ST_3DExtent and parallels Sedona's existing ST_Extent (which returns a Box2D).

  • New aggregator ST_3DExtent in AggregateFunctions.scala: folds a column of Geometry into a Box3D. Skips null and empty geometries, returns NULL when no rows contributed. Geometries without a Z dimension fold into z = 0 per coordinate, matching PostGIS.
  • Registered in Catalog.aggregateExpressions so SQL SELECT ST_3DExtent(geom) FROM ... resolves.
  • DataFrame API helpers: st_aggregates.ST_3DExtent(Column) / ST_3DExtent(String) (Scala) and sedona.spark.sql.st_aggregates.ST_3DExtent (Python).
  • Box3DExtentSuite: aggregation over mixed XY/XYZ rows, NULL on empty input, NULL-row skip.
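
For readers who want the fold semantics spelled out, here is a minimal plain-Python sketch of the behaviour described in the bullets above. It is an illustration only, not the Scala code in AggregateFunctions.scala. The name `extent_3d`, the coordinate-tuple model of a geometry, and the `(xmin, ymin, zmin, xmax, ymax, zmax)` ordering are all assumptions made for the example.

```python
from typing import Iterable, Optional, Sequence, Tuple

# Hypothetical stand-ins: a geometry is a list of (x, y) or (x, y, z) tuples,
# None models a SQL NULL row, and an empty list models an empty geometry.
Coord = Sequence[float]
# Assumed ordering for the example: xmin, ymin, zmin, xmax, ymax, zmax.
Box3D = Tuple[float, float, float, float, float, float]


def extent_3d(geoms: Iterable[Optional[Sequence[Coord]]]) -> Optional[Box3D]:
    """Fold a column of geometries into one 3D bounding box."""
    box: Optional[Box3D] = None
    for geom in geoms:
        if not geom:  # skip NULL rows and empty geometries
            continue
        for coord in geom:
            x, y = coord[0], coord[1]
            # A coordinate without a Z value contributes z = 0.
            z = coord[2] if len(coord) > 2 else 0.0
            if box is None:
                box = (x, y, z, x, y, z)
            else:
                box = (
                    min(box[0], x), min(box[1], y), min(box[2], z),
                    max(box[3], x), max(box[4], y), max(box[5], z),
                )
    return box  # None when no row contributed


rows = [
    [(0.0, 0.0), (2.0, 3.0)],  # XY geometry
    None,                      # NULL row
    [],                        # empty geometry
    [(1.0, -1.0, 5.0)],        # XYZ geometry
]
print(extent_3d(rows))        # (0.0, -1.0, 0.0, 2.0, 3.0, 5.0)
print(extent_3d([None, []]))  # None
```

Note that the XY row pulls zmin down to 0 even though the only explicit Z value is 5; that is the z = 0 fold described above.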

Box3D Phase 1 is now complete across 5 PRs:

  1. [GH-2973] Box3D foundation: value class + UDT + Catalyst plumbing (#2978)
  2. [GH-2973] Box3D constructors: ST_Box3D and ST_3DMakeBox (#2984)
  3. [GH-2973] Box3D accessors + ST_AsText overload (#3005)
  4. [GH-3012] Box3D predicates: ST_3DBoxIntersects and ST_3DBoxContains (#3014)
  5. This PR: ST_3DExtent aggregate

Out of scope (separate follow-ups per the EPIC): Flink/R bindings, docs, GeoParquet covering interop, Box2D↔Box3D casts, spatial join planner integration, filter pushdown, ST_3DDWithin.

How was this patch tested?

  • New ScalaTest suite Box3DExtentSuite covering aggregation over mixed XY/XYZ rows, NULL on empty input, and NULL-row skip.
  • Added DataFrame API smoke tests in dataFrameAPITestScala (Scala) and test_dataframe_api.py (Python). All pass locally.

Did this PR include necessary documentation updates?

Final slice of the Box3D Phase 1 epic. Mirrors PostGIS's ST_3DExtent
and parallels Sedona's existing ST_Extent (which returns a Box2D).

- `Envelope3DBuffer` case class: aggregator buffer with six doubles +
  merge logic. It gives Spark Encoders a Product type to work with,
  because JTS doesn't have a 3D envelope analog and Box3D itself isn't
  a Spark Encoder-friendly Product type.
- `ST_3DExtent` aggregator:
  `Aggregator[Geometry, Option[Envelope3DBuffer], Box3D]`. Skips null
  and empty geometries, returns null when no rows contributed.
  Geometries without a Z dimension fold into z=0 per-coordinate,
  matching PostGIS.
- Registered in `Catalog.aggregateExpressions` so SQL `SELECT
  ST_3DExtent(geom) FROM ...` resolves.
- `Box3DExtentSuite`: aggregation over mixed XY/XYZ rows, NULL on empty
  input, NULL-row skip.
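
The buffer-and-merge idea can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Scala case class from this commit: the field names, the `merge` method name, and the `merge_buffers` helper are hypothetical, and Python's `None` stands in for Scala's `Option` being empty.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Optional


@dataclass(frozen=True)
class Envelope3DBuffer:
    """Illustrative six-double buffer; field names are assumptions."""
    xmin: float
    ymin: float
    zmin: float
    xmax: float
    ymax: float
    zmax: float

    def merge(self, other: "Envelope3DBuffer") -> "Envelope3DBuffer":
        # Component-wise min of the lower corner, max of the upper corner.
        return Envelope3DBuffer(
            min(self.xmin, other.xmin),
            min(self.ymin, other.ymin),
            min(self.zmin, other.zmin),
            max(self.xmax, other.xmax),
            max(self.ymax, other.ymax),
            max(self.zmax, other.zmax),
        )


def merge_buffers(
    a: Optional[Envelope3DBuffer], b: Optional[Envelope3DBuffer]
) -> Optional[Envelope3DBuffer]:
    # None models a partial aggregate that saw no usable rows.
    if a is None:
        return b
    if b is None:
        return a
    return a.merge(b)


# Hypothetical per-partition partial results, two of them empty.
partials = [
    None,
    Envelope3DBuffer(0.0, 0.0, 0.0, 2.0, 3.0, 0.0),
    Envelope3DBuffer(1.0, -1.0, 5.0, 1.0, -1.0, 5.0),
    None,
]
result = reduce(merge_buffers, partials, None)
print(result)
```

Keeping the "no rows yet" state outside the six doubles means the final step can return null without reserving sentinel values such as infinities inside the buffer.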

Box3D Phase 1 is now complete across 5 PRs (foundation, constructors,
accessors + AsText, predicates, this aggregate).
Contributor

Copilot AI left a comment

Pull request overview

Adds the ST_3DExtent SQL aggregate (PostGIS-compatible) to compute a Box3D extent over a geometry column in Sedona Spark SQL, including DataFrame helpers and cross-language (Scala/Python) test coverage.

Changes:

  • Implemented ST_3DExtent aggregator in Spark SQL expressions and registered it in the Sedona SQL Catalog.
  • Added Scala DataFrame API wrappers in st_aggregates for ST_3DExtent(Column|String).
  • Added end-to-end tests for SQL/DataFrame usage in Scala and Python.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| `spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/AggregateFunctions.scala` | Implements the ST_3DExtent aggregator and its buffer type, producing Box3D results and skipping null/empty geometries. |
| `spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/st_aggregates.scala` | Exposes Scala DataFrame helper methods for calling the new aggregate. |
| `spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala` | Registers the new aggregate so it resolves in SQL (ST_3DExtent). |
| `spark/common/src/test/scala/org/apache/sedona/sql/Box3DExtentSuite.scala` | Adds SQL-level ScalaTest coverage for aggregation semantics (mixed XY/XYZ, empty input, null/empty row skipping). |
| `spark/common/src/test/scala/org/apache/sedona/sql/dataFrameAPITestScala.scala` | Adds a Scala DataFrame API smoke test for ST_3DExtent. |
| `python/sedona/spark/sql/st_aggregates.py` | Adds the Python DataFrame API wrapper for ST_3DExtent. |
| `python/tests/sql/test_dataframe_api.py` | Adds a Python DataFrame API test case for ST_3DExtent (string-cast comparison). |

@jiayuasu jiayuasu linked an issue May 30, 2026 that may be closed by this pull request
@jiayuasu jiayuasu added this to the sedona-1.9.1 milestone May 30, 2026
@jiayuasu jiayuasu merged commit 2e3b3a8 into apache:master May 30, 2026
44 checks passed