perf: implement bitpack encoding for LID and MID blocks#328
Open
3accfd6 to
53ef5eb
Compare
Member
Author
@seqbenchbot up main bulk
Contributor
🔴 Performance Degradation: some benchmarks have degraded compared to the previous run.
Member
Author
@seqbenchbot down dc2d4d40
Nice, @cheb0 The benchmark with identificator Show summary
Have a great time!
bf8f3d8 to
985f11f
Compare
Contributor
🔴 Performance Degradation: some benchmarks have degraded compared to the previous run.
985f11f to
4c802d2
Compare
Contributor
🔴 Performance Degradation: some benchmarks have degraded compared to the previous run.
Description
Replaces varint encoding with faster delta bitpacking. Both LID and MID blocks now use bitpack. Currently the intcomp library is used. It doesn't use SIMD, so we might switch to something else in the future.
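For readers unfamiliar with the technique, here is a minimal sketch of delta bitpacking for one block of sorted IDs. It is not the intcomp API and not the on-disk layout used in this PR: the `packed` struct and the `pack` and `unpack` functions are hypothetical names made up for illustration. The idea is to store the first value as-is, compute the deltas between neighbours, and write every delta with the same fixed bit width (the width of the largest delta in the block).

```go
package main

import (
	"fmt"
	"math/bits"
)

// packed is a hypothetical in-memory layout for one delta-bitpacked block.
type packed struct {
	first uint32   // first ID, stored verbatim
	n     int      // number of IDs in the block
	width uint     // bits used per delta
	words []uint64 // deltas written back to back, width bits each
}

// pack encodes a non-decreasing slice of IDs.
func pack(ids []uint32) packed {
	if len(ids) == 0 {
		return packed{}
	}
	// The bit width is dictated by the largest delta in the block.
	var maxDelta uint32
	for i := 1; i < len(ids); i++ {
		if d := ids[i] - ids[i-1]; d > maxDelta {
			maxDelta = d
		}
	}
	p := packed{first: ids[0], n: len(ids), width: uint(bits.Len32(maxDelta))}
	if p.width == 0 {
		return p // single ID or all deltas are zero: nothing to store
	}
	totalBits := uint(len(ids)-1) * p.width
	p.words = make([]uint64, (totalBits+63)/64)
	var pos uint
	for i := 1; i < len(ids); i++ {
		d := uint64(ids[i] - ids[i-1])
		w, off := pos/64, pos%64
		p.words[w] |= d << off
		if off+p.width > 64 { // delta straddles two words
			p.words[w+1] |= d >> (64 - off)
		}
		pos += p.width
	}
	return p
}

// unpack restores the original IDs by reading deltas and prefix-summing them.
func unpack(p packed) []uint32 {
	if p.n == 0 {
		return nil
	}
	ids := make([]uint32, p.n)
	ids[0] = p.first
	mask := uint64(1)<<p.width - 1
	var pos uint
	for i := 1; i < p.n; i++ {
		var d uint64
		if p.width > 0 {
			w, off := pos/64, pos%64
			d = p.words[w] >> off
			if off+p.width > 64 {
				d |= p.words[w+1] << (64 - off)
			}
		}
		ids[i] = ids[i-1] + uint32(d&mask)
		pos += p.width
	}
	return ids
}

func main() {
	ids := []uint32{100, 103, 104, 110, 125, 126}
	p := pack(ids)
	fmt.Println(p.width, len(p.words), unpack(p))
}
```

Unlike varint, decoding here has no per-byte branching on continuation bits, which is one plausible reason fixed-width unpacking is faster on the read path.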
Measurements
compression
Bitpacked data is much more compact on its own: zstd compresses varint-encoded data with a ratio of ~1.7-2.0, but delta-bitpacked data with a ratio of only ~1.3. So we could potentially disable zstd on benchmarks at the cost of a slight dataset size overhead.
dataset size
Overall, we reach approximately the same dataset size. For some envs there is a small benefit of around -3% of the total dataset size.
search latency (prod fractions)
Usually, cold search requests are affected. I measured search latency on a single repacked fraction. For example,
message:"XYZ" AND NOT k8s_service_name:"ABC" AND NOT request_host:"google.com" AND NOT cluster_name:"zxc"
12 ms => 8 ms (cold)
For aggregations the benefit is lower simply because there is more CPU work. It would have higher benefit if
service:xyz group by k8s_pod
130 ms => 110 ms (cold)
Overall, the perf improvement is around 5-30% for cold search requests, depending on the particular request.
search latency (logbench)
TODO: I will measure it on my PC, since a network disk makes measurements of cold search requests problematic (results vary a lot).
Fixes #312