Add chunking edge case tests #481

Open
lxingy3 wants to merge 1 commit into google:main from lxingy3:add-chunking-edge-tests

lxingy3 commented Jun 17, 2026

Description

Adds focused test coverage for chunking edge cases documented in #430:

  • SentenceIterator start-position guards
  • token interval helper error paths
  • ChunkIterator constructor fallbacks
  • TextChunk missing-document errors, sanitized text, and lazy caching
  • batching with a short batch length
  • broken_sentence reset behavior after a sentence is split mid-way

Fixes #430
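
Two of the edge cases listed above (batching with a short batch length, and lazy caching of chunk text) can be illustrated with a minimal, self-contained sketch. The names `make_batches` and `LazyChunk` are hypothetical stand-ins written for this illustration; they are not the chunking module's API, and the actual tests in this PR exercise the real classes instead.

```python
import functools
import itertools
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def make_batches(items: Iterable[T], batch_length: int) -> Iterator[List[T]]:
    """Hypothetical batching helper (an assumption, not the project's API)."""
    if batch_length < 1:
        raise ValueError("batch_length must be at least 1")
    iterator = iter(items)
    # islice returns an empty list once the iterator is exhausted,
    # which ends the loop; a trailing partial batch is still yielded.
    while batch := list(itertools.islice(iterator, batch_length)):
        yield batch


class LazyChunk:
    """Hypothetical chunk whose text is built on first access, then cached."""

    def __init__(self, source: str, start: int, end: int):
        self._source = source
        self._start = start
        self._end = end
        self.compute_count = 0  # Lets a test observe how often text is built.

    @functools.cached_property
    def text(self) -> str:
        self.compute_count += 1
        return self._source[self._start:self._end]


def test_short_batch_length_keeps_trailing_partial_batch():
    assert list(make_batches(["a", "b", "c"], 2)) == [["a", "b"], ["c"]]


def test_text_is_computed_once():
    chunk = LazyChunk("hello world", 0, 5)
    assert chunk.text == "hello"
    assert chunk.text == "hello"  # Second access should hit the cache.
    assert chunk.compute_count == 1
```

Exposing a counter such as `compute_count` is one way to make caching observable from a test; patching the underlying computation with a mock and asserting on its call count is a common alternative.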

Testing

How Has This Been Tested?

.venv\Scripts\python.exe -m pytest tests\chunking_test.py

Checklist:

  • I have read and acknowledged Google's Open Source Code of Conduct.
  • I have read the Contributing page, and I either signed the Google Individual CLA or am covered by my company's Corporate CLA.
  • I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach.
  • I have made any needed documentation changes, or noted in the linked issue(s) that documentation elsewhere needs updating.
  • I have added tests, or I have ensured existing tests cover the changes.
  • I have followed Google's Python Style Guide and ran pylint over the affected code.

github-actions Bot added the size/M (Pull request with 150-600 lines changed) label Jun 17, 2026

google-cla Bot commented Jun 17, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

lxingy3 force-pushed the add-chunking-edge-tests branch from 2269611 to 1a6c12b on June 17, 2026 12:15

lxingy3 commented Jun 28, 2026

recheck

lxingy3 commented Jun 28, 2026

recheck

lxingy3 force-pushed the add-chunking-edge-tests branch from 1a6c12b to 7068373 on June 28, 2026 13:59

Development

Successfully merging this pull request may close these issues.

test: Add missing test coverage for chunking module edge cases

1 participant