Commit 51211df
fix: Split large sels into multiple batches in CompactBatchs.Union - hotfix (#22841)
### **User description**
## What type of PR is this?
- [ ] API-change
- [x] BUG
- [ ] Improvement
- [ ] Documentation
- [ ] Feature
- [ ] Test and CI
- [ ] Code Refactoring
## Which issue(s) this PR fixes:
issue #22825
## What this PR does / why we need it:
This PR fixes a bug in the `CompactBatchs.Union` method: large
selection arrays (`sels`) exceeding `batchMaxRow` were not
split into multiple batches when the batch collection was empty
(`bats.Length() == 0`).
**Problem:**
In the `Union` method, when `bats.Length() == 0` and `sels` length
exceeds `batchMaxRow`, the original implementation would create a single
batch containing all selected rows without checking the batch size
limit. This violates the `batchMaxRow` constraint that each batch should
respect, potentially causing:
- Batches exceeding the maximum row limit
- Inconsistent behavior compared to the case when `bats.Length() != 0`
(which already handles large `sels` correctly by splitting them)
**Solution:**
Modified the `Union` method to handle large `sels` arrays when
`bats.Length() == 0` by splitting them into multiple batches, ensuring
each batch respects the `batchMaxRow` limit. The fix iteratively
processes `sels` in chunks of `batchMaxRow` size, creating multiple
batches as needed.
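
The chunking step described above can be sketched roughly as follows. This is a simplified illustration, not the code from the PR: `splitSels` is a hypothetical helper, the `int32` element type is an assumption, and `batchMaxRow` is set to a tiny value so the output is easy to read.

```go
package main

import "fmt"

// batchMaxRow stands in for the real limit; the value 4 is an assumption
// chosen only to keep the example small.
const batchMaxRow = 4

// splitSels is a hypothetical helper (not part of the PR). It cuts sels into
// consecutive chunks, each holding at most maxRow entries.
func splitSels(sels []int32, maxRow int) [][]int32 {
	var chunks [][]int32
	for len(sels) > 0 {
		n := len(sels)
		if n > maxRow {
			n = maxRow
		}
		chunks = append(chunks, sels[:n])
		sels = sels[n:]
	}
	return chunks
}

func main() {
	sels := []int32{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
	for i, c := range splitSels(sels, batchMaxRow) {
		fmt.Printf("batch %d: %d rows\n", i, len(c))
	}
}
```

With ten selected rows and a limit of four, the loop yields chunks of 4, 4, and 2 rows, so no single batch exceeds the limit.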
**Changes:**
- Updated `pkg/container/batch/compact_batchs.go`: added logic in the
`bats.Length() == 0` branch to split large `sels` into multiple batches,
making it consistent with the existing logic for the `bats.Length() != 0`
case
- Added test cases in `pkg/container/batch/compact_batchs_test.go`:
`TestCompactBatchsUnionLargeSels` covers four scenarios:
  - Union with `selsLen > batchMaxRow` when `bats.Length() == 0`
  - Union with `selsLen == batchMaxRow` when `bats.Length() == 0`
  - Union with large `sels` when the last batch is already full
  - Union with large `sels` when the last batch has some rows
This fix ensures consistent batch size handling across all code paths in
the `Union` method, preventing potential issues with oversized batches.
___
### **PR Type**
Bug fix, Tests
___
### **Description**
- Splits large selection arrays into multiple batches when the batch
collection is empty
- Ensures each batch respects the `batchMaxRow` limit
- Adds test coverage for large selections
- Fixes inconsistent behavior between empty and non-empty batch
collections
___
### Diagram Walkthrough
```mermaid
flowchart LR
A["Union method called<br/>with large sels"] --> B{"bats.Length() == 0?"}
B -->|Yes| C["Split sels into<br/>batchMaxRow chunks"]
C --> D["Create multiple<br/>batches iteratively"]
D --> E["Each batch respects<br/>batchMaxRow limit"]
B -->|No| F["Existing logic<br/>handles splitting"]
E --> G["Consistent behavior<br/>across all paths"]
F --> G
```
<details> <summary><h3> File Walkthrough</h3></summary>
<table><thead><tr><th></th><th align="left">Relevant
files</th></tr></thead><tbody><tr><td><strong>Bug
fix</strong></td><td><table>
<tr>
<td>
<details>
<summary><strong>compact_batchs.go</strong><dd><code>Implement batch
splitting for large selections</code>
</dd></summary>
<hr>
pkg/container/batch/compact_batchs.go
<ul><li>Wrapped large selection handling in a loop to process selections
in <br>chunks<br> <li> Each chunk is limited to <code>batchMaxRow</code>
size<br> <li> Creates multiple batches iteratively instead of single
oversized batch<br> <li> Maintains consistency with existing logic for
non-empty batch <br>collections</ul>
</details>
</td>
<td><a
href="https://github.com/matrixorigin/matrixone/pull/22841/files#diff-13a01ffd2c00c5e8faff10e905d8067f400bafe3d940f6da252774850e0bfc85">+18/-8</a>
</td>
</tr>
</table></td></tr><tr><td><strong>Tests</strong></td><td><table>
<tr>
<td>
<details>
<summary><strong>compact_batchs_test.go</strong><dd><code>Add
comprehensive tests for large selection splitting</code>
</dd></summary>
<hr>
pkg/container/batch/compact_batchs_test.go
<ul><li>Added <code>TestCompactBatchsUnionLargeSels</code> test function
with four <br>comprehensive test cases<br> <li> Test case 1: Validates
splitting when selsLen > batchMaxRow on empty <br>collection<br> <li>
Test case 2: Validates single batch creation when selsLen ==
<br>batchMaxRow<br> <li> Test case 3: Validates new batch creation when
last batch is full<br> <li> Test case 4: Validates batch filling and
overflow when last batch has <br>partial rows</ul>
</details>
</td>
<td><a
href="https://github.com/matrixorigin/matrixone/pull/22841/files#diff-b568feda7c727d7a72c52ec6ebd3a44ed0676127e33093f4d66ddc1c784b2471">+116/-0</a>
</td>
</tr>
</table></td></tr></tbody></table>
</details>
___
1 parent a8f97af, commit 51211df

2 files changed in `pkg/container/batch`: +134 -8 lines changed