Allow full unfiltering for partial data #640
Conversation
This relaxes the constraint that the regions between `available..filled` and the actual rows being worked on must not overlap. Instead, different methods are used depending on which part is worked on. The obligation of avoiding a mutation of the back-reference buffer is on the caller (we have `debug_assert`s in place), which had already been the case, since the usual `unfilter_row` method would not verify that enough space was available.

There are a few points that offer design decisions:

- Previously we had the implicit invariant that the zlib window must start at `0` if it was not full yet, since the decoder determines the back-reference range by subtracting the maximum possible window. Without any compensation we cannot place a mutable row region at the start of the data buffer before the window is full, otherwise we would move the read-only window *backwards*, violating invariants. Since a window that is too big isn't a problem, we could ignore that and move the data buffer backwards by a large amount when we establish that region. Alternatively, any commit to the `available` field can enforce that it is only updated forwards (i.e. it only moves back with the data).
- The `prev_row` index was used to determine the amount of available shift back, but it can now lie within the region if an interlace reset occurs while we do read-ahead of rows (we still use the check `prev_row == current_row` to distinguish the first row of each interlace pass). So the shift back is changed to use the minimum of `prev_row` and `available`, which is equivalent to the previous behavior for a shift back after `reset` in all existing cases, but also works if we shift back after having read into the deflate buffer region.
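The shift-back clamping described in the last point can be sketched as a one-liner; the function name is hypothetical, the real computation lives inside `UnfilteringBuffer`:

```rust
/// Hypothetical sketch of the shift-back bound. Previously the shift
/// back used `prev_row` alone, assuming the read-only zlib window never
/// started before it. After an interlace reset during read-ahead,
/// `prev_row` can lie inside the window, so clamp by `available` to
/// avoid moving the window backwards.
fn shift_back_amount(prev_row: usize, available: usize) -> usize {
    prev_row.min(available)
}
```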
The performance difference seems to have been an artifact of the known LLVM problems. Here are the results from Rust
I feel uneasy about the fuzzer results. On the one hand, this kind of finding just confirms the known bug from #605, which has been around for a while; we only now got it into the corpus at a sufficiently shallow depth for CI to stumble onto it. Here's a file that even reproduces it on the main branch: On the other hand, it could mask some issues with this PR.
anforowicz left a comment
Thank you for the PR. Just a few drive-by comments from somebody who is not a maintainer (so take with a grain of salt).
I feel uneasy about the fuzzer results.
I wonder how we could increase our confidence that the new code works correctly.
I note that the new test verifies that the first 10 rows decode successfully, but it does not verify the actual pixels of those 10 rows. Maybe we could add a test that chops the input off at a random location and verifies that, if some X initial rows can be decoded, the resulting pixels are the same as in the X initial rows of the untruncated input.
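A property-style sketch of that truncation test, using a toy stand-in for the decoder (the real test would drive `png::Decoder` over a truncated reader; `decode_rows` here is purely illustrative):

```rust
/// Stand-in for the real decoder: "decodes" one row per 4 input bytes.
/// The actual test would decode real PNG data via `png::Decoder`.
fn decode_rows(input: &[u8]) -> Vec<Vec<u8>> {
    input.chunks_exact(4).map(|c| c.to_vec()).collect()
}

/// Property: every row decodable from an input truncated at `cut` must
/// equal the corresponding row of the full input.
fn rows_match_prefix(full: &[u8], cut: usize) -> bool {
    let full_rows = decode_rows(full);
    let partial_rows = decode_rows(&full[..cut]);
    partial_rows.iter().zip(&full_rows).all(|(p, f)| p == f)
}
```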
```rust
))?;

// Unfilter this in a temporary buffer.
let mut row = row[1..rowlen].to_vec();
```
nit: IIUC to_vec will always allocate here. Would it be possible to avoid doing such an allocation for ~every row in a partially read input? Can we allocate once and then reuse the allocation?
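One way to realize that suggestion, sketched with hypothetical names (the crate's actual buffer management may differ):

```rust
/// Sketch of the reviewer's suggestion: allocate one scratch buffer and
/// reuse it for every row instead of calling `to_vec()` per row.
/// `RowScratch` and `fill_from` are illustrative, not the crate's API.
struct RowScratch(Vec<u8>);

impl RowScratch {
    fn fill_from(&mut self, filtered: &[u8]) -> &mut [u8] {
        self.0.clear();                      // keeps the allocation
        self.0.extend_from_slice(filtered);  // allocates only when capacity grows
        self.0.as_mut_slice()
    }
}
```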
```rust
let end_of_new_prev = start_of_new_prev + padding;

// Shift everything up as required.
let current_shift = end_of_new_prev - potential_end_of_new_prev;
```
I wonder if, instead of reusing/shifting `self.data_stream`, it may be a bit easier/clearer to introduce a separate buffer (or two) as a field of `UnfilteringBuffer`? Say `copied_curr_row: Vec<u8>` and `copied_prev_row: Option<Vec<u8>>` or something like that?
Oh, and if we do that, then I guess the type of the `current_start` and `prev_start` fields would need to change from `usize` to something like `Either<usize, usize>` (where `Either::Left` means the old behavior, i.e. the data is in `data_stream`, and `Either::Right` means that the rows are in the new buffers).
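A minimal sketch of that two-location idea, with illustrative names (the crate uses plain `usize` offsets today, so this is only the shape of the suggestion):

```rust
/// Rows either live in place inside `data_stream` (old behavior, an
/// offset) or in separate side buffers (new fallback, an index).
enum RowLocation {
    InStream(usize), // offset into data_stream
    Copied(usize),   // index into the side buffers
}

/// Resolve a row to a slice; every access now pays one `match`, which
/// is the overhead discussed below.
fn current_row<'a>(
    data_stream: &'a [u8],
    copied: &'a [Vec<u8>],
    loc: &RowLocation,
    rowlen: usize,
) -> &'a [u8] {
    match loc {
        RowLocation::InStream(start) => &data_stream[*start..*start + rowlen],
        RowLocation::Copied(i) => &copied[*i],
    }
}
```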
The type change of `current_start` was the main concern that drove me away from that concept. I wanted to keep the influence on the main path as low as possible; having an `Either` would add an extra match / conditional read on every single path accessing `current_start`, even the happy one. I can benchmark it but, intuitively, it will be quite visible.
I think there might be a complication with this approach: this only applies until 32768 bytes of uncompressed output have been produced, at which point all possible back-reference distances are valid.
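For reference, the bound being discussed: DEFLATE back-references reach at most 32768 bytes, so once that much output exists the window only slides forward. A sketch, with hypothetical names:

```rust
/// Maximum DEFLATE back-reference distance (RFC 1951).
const MAX_WINDOW: usize = 32 * 1024;

/// First byte that is still reachable by a back-reference. While less
/// than `MAX_WINDOW` bytes of output exist, the window starts at 0,
/// which is the implicit invariant mentioned in the PR description.
fn window_start(total_output: usize) -> usize {
    total_output.saturating_sub(MAX_WINDOW)
}
```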
This doesn't seem like a hard blocker; we just need to ensure that Line 120 in 14496f2
Does that sound reasonable?
Closes: #639
With the in-place unfilter buffer we forced unfiltering to trail behind the deflate window, whereas previously it could copy data that was still used for back-references and filter on that copy. That resulted in better performance, but for partial images we lost out on 32kB of available image data. This relaxes the `UnfilteringBuffer` to allow both: use in-place unfiltering for the common path, but add a fallback for unfiltering the back-reference portion of the data when we hit `UnexpectedEof`.

Performance with this patch is mostly good, but I'm hitting some erratic, weird fluctuations on the noncompressed tests (slower), paletted-zune (faster), and Exoplanet_Phase_Curve_(Diagram).png (also much faster??). I'd very much like to understand the individual causes here, given that we do just two extra bounds checks and an integer-min that shouldn't be in extremely tight loops.
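The two-path structure described above, reduced to a toy sketch (all names are illustrative, not the crate's API):

```rust
/// Toy model of the dispatch: the common path unfilters in place, and
/// only a partial image (`UnexpectedEof` from the in-place path) routes
/// through the slower copy-based fallback.
#[derive(Debug, PartialEq)]
enum RowError {
    UnexpectedEof,
}

fn unfilter_in_place(row_past_window: bool) -> Result<&'static str, RowError> {
    if row_past_window {
        Ok("in-place")
    } else {
        // Row still overlaps the deflate back-reference window.
        Err(RowError::UnexpectedEof)
    }
}

fn next_row(row_past_window: bool) -> &'static str {
    match unfilter_in_place(row_past_window) {
        Ok(path) => path,
        // Fallback: copy the back-reference region and unfilter the copy.
        Err(RowError::UnexpectedEof) => "copied-fallback",
    }
}
```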