[GH-3013] Box3D aggregate: ST_3DExtent #3015
Merged
Final slice of the Box3D Phase 1 epic. Mirrors PostGIS's ST_3DExtent and parallels Sedona's existing ST_Extent (which returns a Box2D).

- `Envelope3DBuffer` case class: aggregator buffer with six doubles + merge logic. Spark Encoders use it because JTS doesn't have a 3D envelope analog and Box3D itself isn't a Spark Encoder-friendly Product type.
- `ST_3DExtent` aggregator: Aggregator[Geometry, Option[Envelope3DBuffer], Box3D]. Skips null and empty geometries, returns null when no rows contributed. Geometries without a Z dimension fold into z=0 per-coordinate, matching PostGIS.
- Registered in `Catalog.aggregateExpressions` so SQL `SELECT ST_3DExtent(geom) FROM ...` resolves.
- `Box3DExtentSuite`: aggregation over mixed XY/XYZ rows, NULL on empty input, NULL-row skip.

Box3D Phase 1 is now complete across 5 PRs (foundation, constructors, accessors + AsText, predicates, this aggregate).
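The fold semantics described above can be modelled in a few lines of plain Python. This is only an illustrative sketch, not the Scala aggregator from the PR: `Envelope3D`, `envelope_of`, `merge_buffers`, and `extent_3d` are hypothetical names, and a geometry is assumed to be a plain list of `(x, y)` or `(x, y, z)` tuples rather than a JTS `Geometry`.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Iterable, Optional, Sequence, Tuple

# Assumption: a coordinate is (x, y) or (x, y, z); a geometry is a sequence of them.
Coord = Tuple[float, ...]
Geom = Sequence[Coord]


@dataclass(frozen=True)
class Envelope3D:
    """Six doubles describing an axis-aligned 3D box (hypothetical model type)."""
    xmin: float
    ymin: float
    zmin: float
    xmax: float
    ymax: float
    zmax: float

    def merge(self, other: "Envelope3D") -> "Envelope3D":
        # Component-wise min/max. The operation is associative and commutative,
        # so partial results can be combined in any order.
        return Envelope3D(
            min(self.xmin, other.xmin), min(self.ymin, other.ymin), min(self.zmin, other.zmin),
            max(self.xmax, other.xmax), max(self.ymax, other.ymax), max(self.zmax, other.zmax),
        )


def envelope_of(geom: Optional[Geom]) -> Optional[Envelope3D]:
    """Envelope of one geometry, or None when the row is null or empty."""
    if not geom:
        return None
    # Coordinates without a Z value contribute z = 0, as the PR description states.
    pts = [(c[0], c[1], c[2] if len(c) > 2 else 0.0) for c in geom]
    xs, ys, zs = zip(*pts)
    return Envelope3D(min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))


def merge_buffers(a: Optional[Envelope3D], b: Optional[Envelope3D]) -> Optional[Envelope3D]:
    """Merge two optional buffers; None acts as the identity (no rows seen yet)."""
    if a is None:
        return b
    if b is None:
        return a
    return a.merge(b)


def extent_3d(rows: Iterable[Optional[Geom]]) -> Optional[Envelope3D]:
    """Fold rows into one envelope; the result is None when no row contributed."""
    return reduce(merge_buffers, (envelope_of(g) for g in rows), None)


rows = [
    [(0.0, 0.0), (2.0, 3.0)],             # XY only: z is treated as 0
    None,                                  # null row: skipped
    [],                                    # empty geometry: skipped
    [(-1.0, 1.0, 5.0), (1.0, 4.0, 7.0)],  # XYZ
]
print(extent_3d(rows))
```

Because the merge step is order-independent, folding the first two rows and the last two rows separately and then merging the two partial buffers gives the same box as a single pass, which is the property a distributed aggregator relies on.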
Pull request overview
Adds the ST_3DExtent SQL aggregate (PostGIS-compatible) to compute a Box3D extent over a geometry column in Sedona Spark SQL, including DataFrame helpers and cross-language (Scala/Python) test coverage.
Changes:
- Implemented `ST_3DExtent` aggregator in Spark SQL expressions and registered it in the Sedona SQL `Catalog`.
- Added Scala DataFrame API wrappers in `st_aggregates` for `ST_3DExtent(Column|String)`.
- Added end-to-end tests for SQL/DataFrame usage in Scala and Python.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/AggregateFunctions.scala | Implements the ST_3DExtent aggregator and its buffer type, producing Box3D results and skipping null/empty geometries. |
| spark/common/src/main/scala/org/apache/spark/sql/sedona_sql/expressions/st_aggregates.scala | Exposes Scala DataFrame helper methods for calling the new aggregate. |
| spark/common/src/main/scala/org/apache/sedona/sql/UDF/Catalog.scala | Registers the new aggregate so it resolves in SQL (ST_3DExtent). |
| spark/common/src/test/scala/org/apache/sedona/sql/Box3DExtentSuite.scala | Adds SQL-level ScalaTest coverage for aggregation semantics (mixed XY/XYZ, empty input, null/empty row skipping). |
| spark/common/src/test/scala/org/apache/sedona/sql/dataFrameAPITestScala.scala | Adds a Scala DataFrame API smoke test for ST_3DExtent. |
| python/sedona/spark/sql/st_aggregates.py | Adds the Python DataFrame API wrapper for ST_3DExtent. |
| python/tests/sql/test_dataframe_api.py | Adds a Python DataFrame API test case for ST_3DExtent (string-cast comparison). |
Did you read the Contributor Guide?
Is this PR related to a ticket?
What changes were proposed in this PR?
Final slice of the Box3D Phase 1 epic. Mirrors PostGIS's `ST_3DExtent` and parallels Sedona's existing `ST_Extent` (which returns a Box2D).

- `ST_3DExtent` in `AggregateFunctions.scala`: folds a column of Geometry into a Box3D. Skips null and empty geometries, returns NULL when no rows contributed. Geometries without a Z dimension fold into `z = 0` per coordinate, matching PostGIS.
- Registered in `Catalog.aggregateExpressions` so SQL `SELECT ST_3DExtent(geom) FROM ...` resolves.
- DataFrame API: `st_aggregates.ST_3DExtent(Column)` / `ST_3DExtent(String)` (Scala) and `sedona.spark.sql.st_aggregates.ST_3DExtent` (Python).
- `Box3DExtentSuite`: aggregation over mixed XY/XYZ rows, NULL on empty input, NULL-row skip.

Box3D Phase 1 is now complete across 5 PRs:

- Foundation
- Constructors (`ST_Box3D`, `ST_3DMakeBox`)
- Accessors + `ST_AsText` overload
- Predicates (`ST_3DBoxIntersects`, `ST_3DBoxContains`)
- `ST_3DExtent` aggregate (this PR)

Out of scope (separate follow-ups per the EPIC): Flink/R bindings, docs, GeoParquet covering interop, Box2D↔Box3D casts, spatial join planner integration, filter pushdown, `ST_3DDWithin`.

How was this patch tested?

- `Box3DExtentSuite` covering aggregation over mixed XY/XYZ rows, NULL on empty input, and NULL-row skip.
- DataFrame API tests in `dataFrameAPITestScala` (Scala) and `test_dataframe_api.py` (Python). All pass locally.

Did this PR include necessary documentation updates?