@TeddyLiang01

full scene coverage by including partial tiles and the right and top edges during tiling

@PatBall1
Owner

PatBall1 commented Nov 3, 2025

Thanks for this @TeddyLiang01 - are you able to check why it is failing to build?

@TeddyLiang01
Author

The Cocoapi fork can't see cython that was installed. Is there any particular reason why we are using that fork? Also I didn't change dependencies so I'm not sure why it's failing to build when it worked before.

@PatBall1
Owner

PatBall1 commented Nov 3, 2025

@ChristopherKotthoff was this something you set?

@PatBall1
Owner

See #211

@TeddyLiang01 can you give a bit more detail about how you solved this issue? Was there any need to handle training tiling and prediction tiling differently given the different requirements?

@TeddyLiang01
Author

@PatBall1
The original version of this tiling function uses np.arange grid generation that calculates a regular tile grid, outputting an array of starting x and y coordinates of each tile (the bottom-left point of the tile). However, because np.arange stops before the upper bound, the old code can stop at the last tile start that does not overrun the top/right boundaries, leaving a gap/strip near the edges when the stride does not divide the image extent exactly.
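
A minimal 1-D sketch of the gap described above. It uses Python's built-in `range`, which has the same exclusive stop as `np.arange`; the bounds, tile size, and the helper name `old_starts` are illustrative assumptions rather than the actual Detectree2 code.

```python
from typing import List


def old_starts(left: int, right: int, tile: int, buffer: int = 0) -> List[int]:
    """Regular tile starts with an exclusive stop, as np.arange would give."""
    return list(range(left + buffer, right - tile - buffer, tile))


left, right, tile = 0, 100, 30
starts = old_starts(left, right, tile)
covered_to = starts[-1] + tile if starts else left

print(starts)              # tile starts along x
print(right - covered_to)  # width of the uncovered strip at the right edge
```

With these numbers the last tile ends at 90, so a 10-unit strip at the right edge is never tiled.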

The revised code uses essentially the same logic as the original, same grid, same tile size. However, after generating the base grid using range and stopping at the same point the original code did, the code now checks whether the tiles reach the top/right boundary. If not, then it will append x_last and/or y_last, which are the starting positions (bottom-left point of a tile) that place a full tile flush with the boundaries while remaining within them.
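
The idea can be sketched in one dimension as follows. This is a simplified illustration, not the code in the PR: the function name `starts_with_flush_edge` is hypothetical, and returning no starts when a full tile cannot fit is an assumption made here.

```python
from typing import List


def starts_with_flush_edge(x0: int, x_last: int, step: int) -> List[int]:
    """Regular starts from x0, plus x_last if the regular grid misses it.

    x_last is the last start that keeps a full tile inside the bounds.
    """
    if x_last < x0:
        return []  # assumption: emit nothing when a full tile cannot fit
    starts = list(range(x0, x_last + 1, step))
    if starts[-1] != x_last:
        starts.append(x_last)  # extra tile flush with the far boundary
    return starts


tile = 30
starts = starts_with_flush_edge(0, 100 - tile, tile)
overlap = starts[-2] + tile - starts[-1]  # overlap of the extra tile with its neighbour
print(starts, overlap)
```

For a 100-unit extent and 30-unit tiles, the extra start lands at 70 and overlaps the previous tile by 20 units.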

For training, all tiles are still full size and within bounds, so training continues exactly as before but with fuller coverage near the boundaries: at most one extra column and/or row of tiles is added in the x and y directions. Prediction is what this change mainly targets. Everything stays the same except for at most one extra tile column and/or row, which overlaps slightly with the previous tiles while still being full size, ensuring full coverage at the boundaries. Detectree2’s stitching and post-processing already handle overlapping tiles, so this change does not require any modifications to later processing.

@PatBall1
Owner

Possibly some issues -- the changes make the grid generation more explicit and often more correct, but they also change behavior in corner cases (small images, negative coordinates, odd tile sizes) and could introduce tiles that extend past image bounds or duplicate starts. I recommend testing the cases listed below and (depending on desired behavior) tightening a couple of checks.

Detailed points and recommendations

  1. Integer rounding of inputs
  • You force buffer, tile_width, tile_height to int(round(...)). That’s reasonable for pixel-based tiling but will change behavior if callers rely on fractional values. Confirm callers expect integer steps.
  2. Inclusive/exclusive and floor/ceil differences
  • You changed the stop calculation from int(data.bounds[2] - tile_width - buffer) to int(math.floor(data.bounds[2]) - tile_width - buffer) and you now create ranges with range(x0, x_last+1, tile_width) (inclusive logic).
  • For positive coords this is equivalent in most cases, but for negative bounds int(...) vs math.floor(...) differ (int(-3.1) -> -3, floor -> -4). That will shift starts for datasets with negative coordinates. If you want consistent geometric behavior prefer one and document it; floor is usually safer for spatial bounds.
  3. Small images / insufficient room for a full tile
  • Original np.arange would return empty when there wasn’t room for a full tile (no starts). New code sets x_starts = [x0] (and similarly for y) when x_last < x0, which will create a tile starting at x0 even if tile_width exceeds the image width (so tile will extend beyond bounds).
  • This is a behavior change. If you intended to always produce at least one tile (possibly clipped later), that’s OK. If you preferred “no tiles” for too-small images, change this logic back or clamp the start so the tile fits:
    • Example safe alternative that only emits a start when a full tile fits:
      x0 = int(math.ceil(data.bounds[0])) + buffer
      x_last = int(math.floor(data.bounds[2]) - tile_width - buffer)
      if x_last < x0:
          x_starts = []  # no full tile fits, so explicitly produce no tiles
          # or start a single tile at x_last instead: its right edge stays inside
          # the bounds, but its left edge may overrun the buffer or the image
  4. Half-shifted (overlapping) grid calculation
  • You compute x0_h = x0 + tile_width // 2 and x_last_h = x_last + tile_width // 2. That may be fine, but:
    • Using integer //2 for odd tile widths will shift by floor(tile_width/2); consider whether you want exact half-pixel shifts (probably not necessary).
    • The formula x_last_h = x_last + tile_width//2 is derived from x_last for the non-shifted grid; it may produce starts that still let tiles extend past the image in the half-shift case or not fully use the available space. It’s safer to recompute x_last_h from bounds similarly to x_last:
      x0_h = int(math.ceil(bounds_left)) + buffer + tile_width//2
      x_last_h = int(math.floor(bounds_right) - tile_width - buffer) # then proceed the same way
  • Also check the conditions used to set x_starts_h / y_starts_h — you have slightly different guard logic (if not x_starts_h and x_last_h >= x0_h then set [x0_h]) vs original. Ensure consistent semantics.
  5. Potential duplicates
  • If you append the half-shifted grid, some coordinates (edges) might be included twice. If duplicate starts are a problem, deduplicate:
    coordinates = list(dict.fromkeys(grid_coords)) # preserves order
    or use a set to check before append.
  6. Off-by-one and inclusivity
  • Using range(..., x_last + 1, tile_width) ensures x_last is considered. Confirm that x_last was intended to be an inclusive start. With the previous np.arange(stop) semantics the stop was exclusive; this subtle difference can add an extra start at the image edge compared to the old code.
  7. Tests to run
  • Image smaller than tile (tile_width > image width) with overlapping_tiles False/True.
  • Negative coordinate bounds (images with negative geotransform origin).
  • Odd tile sizes (e.g., 31) to check half-shift behavior.
  • Cases where tiles exactly tile the image (no remainder) and cases with remainder.
  • Verify there are no duplicated coordinates and that downstream code handles tiles that extend outside bounds (or clamp them if desired).
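
Two of the points above (int versus floor on negative bounds, and order-preserving de-duplication) can be checked with a short snippet. The bound value and the coordinate list are made-up examples.

```python
import math

right = -3.1               # hypothetical right bound in map units
print(int(right))          # truncates toward zero
print(math.floor(right))   # rounds toward negative infinity

# Order-preserving de-duplication of (x, y) tile starts
grid_coords = [(0, 0), (30, 0), (0, 0), (70, 0), (30, 0)]
unique_coords = list(dict.fromkeys(grid_coords))
print(unique_coords)
```

The two roundings differ by one unit for negative non-integer bounds, which is enough to shift every start derived from them.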

Actionable suggestions

  • Decide whether you want “at least one tile even if it extends beyond bounds” or “no tiles for too-small images” and make the x_starts/y_starts fallback reflect that.
  • Recompute half-shift x_last_h / y_last_h from bounds instead of reusing x_last +/- tile_width//2 to avoid incorrect end conditions.
  • Consider deduplicating grid_coords when combining base and half-shifted grids.
  • Add unit tests for the edge cases above.
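
As a rough illustration of the second suggestion, the sketch below recomputes the last half-shifted start from the bounds and compares it with the `x_last + tile_width // 2` shortcut. The function name `half_shift_starts` and the bounds are assumptions for this example only.

```python
import math
from typing import List


def half_shift_starts(left: float, right: float, tile: int, buffer: int = 0) -> List[int]:
    """Half-shifted starts whose tiles stay inside [left + buffer, right - buffer]."""
    x0_h = int(math.ceil(left)) + buffer + tile // 2
    # Recomputed from the bounds rather than derived as x_last + tile // 2
    x_last_h = int(math.floor(right)) - tile - buffer
    if x_last_h < x0_h:
        return []
    return list(range(x0_h, x_last_h + 1, tile))


left, right, tile = 0.0, 100.0, 30
starts = half_shift_starts(left, right, tile)
naive_last = (int(math.floor(right)) - tile) + tile // 2  # x_last + tile // 2

print(starts)
print(naive_last + tile > right)  # True: a tile starting there would overrun the right edge
```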

@PatBall1
Owner

@TeddyLiang01 let me know what you think. We could have a complete parameter that forces the last columns/row to be included with an overlap with the existing grid?

Me: Would it be possible, say, to keep the original discarding of incomplete tiles to avoid any issues, but then cover the final incomplete row/column by starting a second tiling from the opposite corner of the raster (relative to the initial starting point) that covers just that last column/row, accepting some overlap with the original tiles?

Copilot:
Yes — that pattern is possible and straightforward: keep the base grid generation that only emits full tiles (no “force include last start”), then add a short complementary tiling that begins at the opposite edge and only supplies the one (or few) starts needed to cover the final incomplete column and/or row. Those complementary starts will overlap the last full-column/row from the base grid but will avoid producing any tiles that extend past the raster.

Below is a ready-to-drop-in implementation you can paste in place of the current "grid" block. It:

  • keeps the base behavior of only emitting full tiles (no auto-append of the last start),
  • computes an "opposite" start for the final column and/or final row when needed and adds tiles to cover those strips,
  • preserves order and deduplicates coordinates,
  • also applies the same opposite logic to the half-shifted (overlapping_tiles) grid.

Paste this in place of the current tile_placement == "grid" block (it replaces the existing grid-generation logic and includes the if statement itself):

# Assumes the module already has: import math, import rasterio, from typing import List, Tuple
# Ensure integer pixel/metre steps for grid generation
buffer = int(round(buffer))
tile_width = max(1, int(round(tile_width)))
tile_height = max(1, int(round(tile_height)))

coordinates: List[Tuple[int, int]] = []
if tile_placement == "grid":
    with rasterio.open(img_path) as data:
        left, bottom, right, top = data.bounds

        # Inclusive start positions (rounded to integer pixel coordinates)
        x0 = int(math.ceil(left)) + buffer
        y0 = int(math.ceil(bottom)) + buffer

        # Last valid start that keeps a whole tile inside the bounds
        x_last = int(math.floor(right)) - tile_width - buffer
        y_last = int(math.floor(top)) - tile_height - buffer

        def make_regular_starts(start: int, last: int, step: int) -> List[int]:
            """Return a regular stepping list of starts that keep whole tiles inside bounds.
               DO NOT add a final 'last' start that isn't on the regular step (preserves
               the original 'discard incomplete' behaviour)."""
            if last >= start:
                return list(range(start, last + 1, step))
            return []

        # Base (left-to-right / bottom-to-top) starts - only full tiles on the regular grid
        x_starts = make_regular_starts(x0, x_last, tile_width)
        y_starts = make_regular_starts(y0, y_last, tile_height)

        grid_coords: List[Tuple[int, int]] = []
        # base grid
        grid_coords.extend([(int(minx), int(miny)) for minx in x_starts for miny in y_starts])

        # Complementary tiling from the opposite edge to cover the *final* incomplete column/row:
        # If the regular left-start grid didn't include the rightmost valid start (x_last),
        # add that one start (cover last column). Same for y.
        x_opposite = []
        if x_last >= x0 and (x_last not in x_starts):
            x_opposite = [x_last]
        y_opposite = []
        if y_last >= y0 and (y_last not in y_starts):
            y_opposite = [y_last]

        # Add the final column (opposite x) across the base rows (if any)
        if x_opposite and y_starts:
            grid_coords.extend([(int(x_opposite[0]), int(y)) for y in y_starts])

        # Add the final row (opposite y) across the base columns (if any)
        if y_opposite and x_starts:
            grid_coords.extend([(int(x), int(y_opposite[0])) for x in x_starts])

        # Add the corner tile (opposite x and opposite y) if both are needed
        if x_opposite and y_opposite:
            grid_coords.append((int(x_opposite[0]), int(y_opposite[0])))

        # Overlapping (half-shifted) grid: same approach but with half-shifted starts
        if overlapping_tiles:
            x0_h = x0 + (tile_width // 2)
            y0_h = y0 + (tile_height // 2)

            x_starts_h = make_regular_starts(x0_h, x_last, tile_width)
            y_starts_h = make_regular_starts(y0_h, y_last, tile_height)

            grid_coords.extend([(int(minx), int(miny)) for minx in x_starts_h for miny in y_starts_h])

            # Opposite for half-shift: same rule - add right/topmost start if it wasn't on the half-shift stepping
            if x_last >= x0_h and (x_last not in x_starts_h) and y_starts_h:
                grid_coords.extend([(int(x_last), int(y)) for y in y_starts_h])
            if y_last >= y0_h and (y_last not in y_starts_h) and x_starts_h:
                grid_coords.extend([(int(x), int(y_last)) for x in x_starts_h])
            if (x_last >= x0_h and (x_last not in x_starts_h)) and (y_last >= y0_h and (y_last not in y_starts_h)):
                grid_coords.append((int(x_last), int(y_last)))

        # Deduplicate while preserving order
        seen = set()
        unique_coords = []
        for c in grid_coords:
            if c not in seen:
                seen.add(c)
                unique_coords.append(c)

        coordinates = unique_coords

Why this approach

  • The base grid still only emits starts that produce full tiles on the regular stepping (so you keep the original safe behaviour).
  • The complementary "opposite-edge" starts are constrained to full-tile starts (x_last/y_last), so they never create tiles that extend beyond the raster. Only the single boundary start is added in each direction (there is no stepping backward by tile_width), which covers the trailing strip at the far edge and limits overlap to the adjacent column/row.
  • Deduplication prevents edge duplicates if base and opposite grids align.

Tests to run

  • A raster whose width (usable area after buffer) leaves a remainder (e.g., usable_width = 3*tile_width + remainder) — check the last column is covered by the opposite-start tiles and overlap is as expected.
  • Raster that tiles exactly — no opposite starts should be added.
  • Very small rasters where no full tile fits — no starts should be emitted.
  • Odd tile sizes (e.g., 31) and negative bounds.
  • overlapping_tiles True/False cases to confirm half-shifted complement is behaving correctly.
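
Several of these cases can be expressed as plain assertions against a 1-D stand-in for the start calculation. This is only a sketch: `tile_starts` and `check` are hypothetical helpers that mirror the "regular grid plus one flush start" logic, not functions from Detectree2, and the half-shifted grid is not covered here.

```python
import math
from typing import List


def tile_starts(lo: float, hi: float, tile: int, buffer: int = 0) -> List[int]:
    """1-D starts: regular full tiles plus one flush start for any trailing strip."""
    first = int(math.ceil(lo)) + buffer
    last = int(math.floor(hi)) - tile - buffer
    if last < first:
        return []
    starts = list(range(first, last + 1, tile))
    if starts[-1] != last:
        starts.append(last)
    return starts


def check(lo: float, hi: float, tile: int) -> List[int]:
    starts = tile_starts(lo, hi, tile)
    assert len(starts) == len(set(starts))                  # no duplicates
    assert all(lo <= s and s + tile <= hi for s in starts)  # tiles inside bounds
    return starts


assert check(0, 100, 30) == [0, 30, 60, 70]  # remainder: one flush start added
assert check(0, 90, 30) == [0, 30, 60]       # exact fit: nothing added
assert check(0, 20, 30) == []                # raster smaller than a tile
assert check(0, 100, 31)[-1] == 69           # odd tile size
assert check(-50.5, 49.5, 30)[0] == -50      # negative, fractional bounds
```

The same checks could be extended to 2-D coordinates once the desired behaviour for too-small rasters has been agreed.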
