branch-4.0: [feature](search) introduce lucene bool mode for search function #59394 #59745
base: branch-4.0
### What problem does this PR solve?

Issue Number: close #xxx

Related PR: #58545

Problem Summary:

This PR introduces two new features for the SEARCH function:

#### 1. Lucene Boolean Mode

Adds a `mode` option to enable Lucene/Elasticsearch-style query parsing:

```sql
-- Enable Lucene mode via JSON options
SELECT * FROM docs WHERE search('apple AND banana', '{"default_field":"title","mode":"lucene"}');

-- With minimum_should_match
SELECT * FROM docs WHERE search('apple AND banana OR cherry', '{"default_field":"title","mode":"lucene","minimum_should_match":1}');
```

**Key differences from standard mode:**

- AND/OR/NOT work as left-to-right modifiers (not traditional boolean algebra)
- Uses MUST/SHOULD/MUST_NOT internally (like Lucene's Occur enum)
- Pure NOT queries return empty results (need positive clause)

**Behavior comparison:**

| Query | Standard Mode | Lucene Mode |
|-------|--------------|-------------|
| `a AND b` | a ∩ b | +a +b (both MUST) |
| `a OR b` | a ∪ b | a b (both SHOULD, min=1) |
| `NOT a` | ¬a | Empty (no positive clause) |
| `a AND NOT b` | a ∩ ¬b | +a -b (MUST a, MUST_NOT b) |
| `a AND b OR c` | (a ∩ b) ∪ c | +a b c (only a is MUST) |

#### 2. Escape Characters in DSL

Support for escaping special characters using backslash:

| Escape | Description | Example |
|--------|-------------|---------|
| `\ ` | Literal space | `title:First\ Value` matches "First Value" |
| `\(` `\)` | Literal parentheses | `title:hello\(world\)` matches "hello(world)" |
| `\:` | Literal colon | `title:key\:value` matches "key:value" |
| `\\` | Literal backslash | `title:path\\to\\file` matches "path\to\file" |
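The left-to-right modifier behavior in the Lucene mode comparison table can be illustrated with a small Python sketch. This is not the Doris parser: the rule used here (AND marks the previous and current clause as MUST, OR marks both as SHOULD, NOT marks the current clause as MUST_NOT) is an assumption inferred only from the five table rows, and the function names `assign_occurs`, `render`, and `has_positive_clause` are hypothetical. It ignores parentheses, field prefixes, escapes, and `minimum_should_match`.

```python
from typing import List, Tuple

MUST, SHOULD, MUST_NOT = "MUST", "SHOULD", "MUST_NOT"


def assign_occurs(query: str) -> List[Tuple[str, str]]:
    """Assign an occur to each term, scanning operators left to right.

    Assumed rule (inferred from the table, not taken from the Doris source):
      - AND: previous clause and current clause become MUST
      - OR:  previous clause and current clause become SHOULD
      - NOT: current clause becomes MUST_NOT
    A MUST_NOT clause is never rewritten by a later conjunction.
    """
    clauses: List[List[str]] = []  # each entry is [term, occur]
    conj = None      # pending AND/OR seen before the next term
    negate = False   # pending NOT seen before the next term
    for tok in query.split():
        if tok in ("AND", "OR"):
            conj = tok
        elif tok == "NOT":
            negate = True
        else:
            # The conjunction retroactively adjusts the previous clause.
            if conj and clauses and clauses[-1][1] != MUST_NOT:
                clauses[-1][1] = MUST if conj == "AND" else SHOULD
            if negate:
                occur = MUST_NOT
            elif conj == "AND":
                occur = MUST
            else:
                occur = SHOULD
            clauses.append([tok, occur])
            conj, negate = None, False
    return [(term, occur) for term, occur in clauses]


def render(clauses: List[Tuple[str, str]]) -> str:
    """Render clauses in the +term / term / -term notation used in the table."""
    prefix = {MUST: "+", SHOULD: "", MUST_NOT: "-"}
    return " ".join(prefix[occur] + term for term, occur in clauses)


def has_positive_clause(clauses: List[Tuple[str, str]]) -> bool:
    """A query with only MUST_NOT clauses matches nothing in Lucene mode."""
    return any(occur != MUST_NOT for _, occur in clauses)


if __name__ == "__main__":
    for q in ["a AND b", "a OR b", "NOT a", "a AND NOT b", "a AND b OR c"]:
        parsed = assign_occurs(q)
        shown = render(parsed) if has_positive_clause(parsed) else "Empty"
        print(f"{q!r:16} -> {shown}")
```

Under this assumed rule, `a AND b OR c` ends up as `+a b c` because the OR retroactively downgrades `b` from MUST to SHOULD, which is why only `a` remains required.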
Cherry-picked from #59394
Note: This PR depends on #59766 (cherry-pick of #58545) being merged first.
Summary
Introduce Lucene boolean mode for the search function.
Test plan
Related PRs: #59394
Depends on: #59766