
Conversation

@missinglink (Member) commented:

This PR resolves the bug described in #1544.

@missinglink marked this pull request as draft on July 20, 2021 at 13:57
@missinglink (Member Author) commented Jul 20, 2021:

This issue appears to be more complex than I had hoped; it will take some more time to investigate:

  • whether clean.text is producing the correct value
  • whether sanitizer/_tokenizer.js is marking all tokens as 'complete' when the text has extra characters at the end (see the sketch after this list)
  • whether the parser is receiving what it needs and returning the correct thing
  • what effect (if any) these changes would have on /v1/search
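
As a rough illustration of the second point, here is a minimal sketch of how a tokenizer might decide which tokens are 'complete': any token followed by more text is complete, while a token that ends the string may still be partially typed. This is not the actual pelias code; the function name and return shape are hypothetical.

// hypothetical sketch: split the text on whitespace and classify tokens.
// a token is 'complete' when more characters follow it in the original text;
// the final token is only complete when the text ends with trailing whitespace.
function tokenize (text) {
  const tokens = text.trim().split(/\s+/).filter(Boolean);
  const endsWithWhitespace = /\s$/.test(text);

  return tokens.map((token, i) => ({
    body: token,
    complete: i < tokens.length - 1 || endsWithWhitespace
  }));
}

// example: the trailing 'ber' may still be being typed, so it is incomplete
console.log(tokenize('30 main st ber'));
// [ { body: '30', complete: true },
//   { body: 'main', complete: true },
//   { body: 'st', complete: true },
//   { body: 'ber', complete: false } ]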

@missinglink (Member Author) commented:

Note: there is already some code which looks fairly similar to, but is subtly different from, the changes in this PR:

// when $subject is not the end of $clean.text
// then there must be tokens coming afterwards
else if (!clean.text.endsWith(text)) {
  parserConsumedAllTokens = true;
}
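
For context, a concrete illustration of the condition being tested (the variable names clean.text and text are taken from the snippet above; the values are made up):

// hypothetical values: the parser identified 'main st' as the subject,
// but the cleaned text continues with 'berlin' afterwards
const clean = { text: 'main st berlin' };
const text = 'main st';

// the subject is not the end of clean.text, so there must be more tokens after it
console.log(!clean.text.endsWith(text)); // true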
