Change Utf8.encodedLength to just encode and check length on server (unsafeprocessor) #24363
Change Utf8.encodedLength to just encode and check length on server (unsafeprocessor)
Maintain the current loop behavior on mobile (safeprocessor).
Encoding conceptually does a lot more work (both computation and an allocation) than is needed simply to determine how many bytes a string takes when encoded as UTF-8. However, the JDK has privileged access to the string's internals as a byte[], which lets it implement getBytes(), including the encoding itself, faster than any read loop we are able to write to determine the byte length.
Several alternatives were benchmarked, including various alternate loops and other JDK APIs. This version benchmarks as 10x faster on ASCII strings and 2x faster on most Latin-1 and higher Unicode code point strings (with some regression cases for mysterious reasons). The second-best implementation benchmarked is what we have today; the other alternatives were slower than that.
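For readers who want to see the two strategies side by side, here is a minimal sketch. It is not the code in this PR: the class name `Utf8LengthSketch` and both method names are hypothetical, and the loop is a simplified stand-in for the existing implementation. It only illustrates "encode and check length" versus "walk the chars and count".

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; names are not taken from the PR.
final class Utf8LengthSketch {
  private Utf8LengthSketch() {}

  // "Encode and check length": allocates a byte[] but lets the JDK
  // work directly on the string's internal representation.
  // Note: String.getBytes(Charset) substitutes a replacement for
  // unpaired surrogates instead of failing, so malformed input is
  // handled differently from the loop below.
  static int encodedLengthByEncoding(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  // Loop-based count: no allocation, but reads one char at a time
  // through the public String API.
  static int encodedLengthByLoop(String s) {
    int bytes = 0;
    for (int i = 0; i < s.length(); i++) {
      char c = s.charAt(i);
      if (c < 0x80) {
        bytes += 1; // ASCII
      } else if (c < 0x800) {
        bytes += 2; // two-byte sequence
      } else if (Character.isHighSurrogate(c)
          && i + 1 < s.length()
          && Character.isLowSurrogate(s.charAt(i + 1))) {
        bytes += 4; // surrogate pair encodes one supplementary code point
        i++;        // skip the low surrogate
      } else if (Character.isSurrogate(c)) {
        throw new IllegalArgumentException("Unpaired surrogate at index " + i);
      } else {
        bytes += 3; // rest of the Basic Multilingual Plane
      }
    }
    return bytes;
  }
}
```

For well-formed strings both methods return the same count; which one is faster depends on the runtime, which is the point of the benchmarks described above.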