Make image hashing deterministic and format/DPI-agnostic (#4116)#4432
Make image hashing deterministic and format/DPI-agnostic (#4116)#4432benello wants to merge 3 commits intoFlow-Launcher:devfrom
Conversation
📝 WalkthroughWalkthroughReplaces encoder-based hashing with deterministic raw-pixel hashing. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs (2)
26-38: RedundantFormatConvertedBitmapstep —RenderTargetBitmapalready normalizes the format.The destination
RenderTargetBitmapon line 41 is created withPixelFormats.Pbgra32, andDrawingContext.DrawImagewill render anyBitmapSourceinto that target regardless of the source format. The explicitFormatConvertedBitmapconversion is unnecessary work (and an extra allocation) per call.♻️ Proposed simplification
try { - // Normalize to a deterministic pixel format (32-bit premultiplied BGRA) - BitmapSource normalized = image; - if (normalized.Format != PixelFormats.Pbgra32) - { - var converted = new FormatConvertedBitmap(); - converted.BeginInit(); - converted.Source = normalized; - converted.DestinationFormat = PixelFormats.Pbgra32; - converted.EndInit(); - converted.Freeze(); - - normalized = converted; - } - - // Draw into a fixed-size RenderTarget at 96 DPI to ensure deterministic pixel buffer + // Draw into a fixed-size Pbgra32 RenderTarget at 96 DPI to ensure deterministic pixel buffer var rtb = new RenderTargetBitmap(HashSize, HashSize, 96, 96, PixelFormats.Pbgra32);(adjust subsequent references from
normalizedtoimage)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs` around lines 26 - 38, The FormatConvertedBitmap conversion block (creating `converted`, calling BeginInit/EndInit/Freeze and assigning to `normalized`) is redundant because the subsequent `RenderTargetBitmap` is already created with `PixelFormats.Pbgra32` and `DrawingContext.DrawImage` will rasterize any `BitmapSource` into that format; remove the entire conversion and the `normalized` variable and use the original `image` directly where `normalized` is referenced (keep `RenderTargetBitmap` creation and `DrawingContext.DrawImage(image, ...)` as-is) to avoid the extra allocation and work.
60-69: Dead defensive checks — output format is fixed.Because
rtbis constructed on line 41 withPixelFormats.Pbgra32,bppis always 32 andbufferSizeis alwaysHashSize * HashSize * 4. Thebpp < 1andbufferSize < 1guards can never trigger and just add noise. Also, since you now control the format/dimensions,bytesPerPixel/stridecan be hard-coded constants, making intent clearer.♻️ Proposed simplification
- // Extract raw pixel bytes - var bpp = rtb.Format.BitsPerPixel; - if (bpp < 1) - return null; - - var bytesPerPixel = (bpp + 7) / 8; - var stride = rtb.PixelWidth * bytesPerPixel; - - var bufferSize = stride * rtb.PixelHeight; - if (bufferSize < 1) - return null; - - var pixels = new byte[bufferSize]; + // Pbgra32 => 4 bytes per pixel, no row padding for HashSize-aligned widths. + const int BytesPerPixel = 4; + var stride = HashSize * BytesPerPixel; + var pixels = new byte[stride * HashSize]; rtb.CopyPixels(pixels, stride, 0);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs` around lines 60 - 69, The checks for bpp < 1 and bufferSize < 1 are dead because rtb is constructed with PixelFormats.Pbgra32 (so bpp == 32) and dimensions are fixed; remove the redundant guards and simplify by hard-coding bytesPerPixel = 4 and computing stride as HashSize * bytesPerPixel (or rtb.PixelWidth * 4) and bufferSize as stride * HashSize, replacing the computed bpp/bytesPerPixel/stride/bufferSize logic in ImageHashGenerator (references: rtb, PixelFormats.Pbgra32, bpp, bytesPerPixel, stride, bufferSize, HashSize) to make the intent explicit and eliminate unreachable conditionals.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs`:
- Around line 41-57: GetHashFromImage currently calls
RenderTargetBitmap.Render(dv) from thread-pool threads (invoked via
ImageLoader.LoadAsync / InitializeAsync), which is not Dispatcher-safe; marshal
the rendering (and subsequent Freeze()) to the UI dispatcher: wrap
rtb.Render(dv) and rtb.Freeze() in Application.Current?.Dispatcher?.Invoke (or
BeginInvoke and wait) so the RenderTargetBitmap operations run on the Dispatcher
thread, and handle the case Application.Current is null by falling back to an
alternative path (e.g., decode/CopyPixels) or throwing a clear exception; update
references in GetHashFromImage to perform the Dispatcher invoke around the
RenderTargetBitmap.Render and Freeze calls.
---
Nitpick comments:
In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs`:
- Around line 26-38: The FormatConvertedBitmap conversion block (creating
`converted`, calling BeginInit/EndInit/Freeze and assigning to `normalized`) is
redundant because the subsequent `RenderTargetBitmap` is already created with
`PixelFormats.Pbgra32` and `DrawingContext.DrawImage` will rasterize any
`BitmapSource` into that format; remove the entire conversion and the
`normalized` variable and use the original `image` directly where `normalized`
is referenced (keep `RenderTargetBitmap` creation and
`DrawingContext.DrawImage(image, ...)` as-is) to avoid the extra allocation and
work.
- Around line 60-69: The checks for bpp < 1 and bufferSize < 1 are dead because
rtb is constructed with PixelFormats.Pbgra32 (so bpp == 32) and dimensions are
fixed; remove the redundant guards and simplify by hard-coding bytesPerPixel = 4
and computing stride as HashSize * bytesPerPixel (or rtb.PixelWidth * 4) and
bufferSize as stride * HashSize, replacing the computed
bpp/bytesPerPixel/stride/bufferSize logic in ImageHashGenerator (references:
rtb, PixelFormats.Pbgra32, bpp, bytesPerPixel, stride, bufferSize, HashSize) to
make the intent explicit and eliminate unreachable conditionals.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: b0fd1b1a-38be-4581-8439-66f5a464499e
📒 Files selected for processing (1)
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs (1)
53-56: Silentcatchswallows all exceptions — at least log.Returning
nullon any failure means the caller (ImageLoader.LoadAsync) silently bypasses hash-based dedup inGuidToKeyfor that image and may cache it under a fresh key on every load, with no diagnostic trail. Logging at debug/warn (consistent with the rest ofImageLoader, which usesLog.Exception) would make regressions in this hot path observable.📝 Suggested change
- catch - { - return null; - } + catch (Exception ex) + { + Log.Exception(nameof(ImageHashGenerator), $"Failed to compute image hash: {ex.Message}", ex); + return null; + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs` around lines 53 - 56, The empty catch in ImageHashGenerator (the try/catch that currently just "return null") swallows all exceptions—change it to catch Exception (e) and log the exception before returning null so failures are observable; specifically, inside ImageHashGenerator's hash-generation method log the caught exception (using the same logging facility used by ImageLoader.LoadAsync, e.g., Log.Exception or the project logger) at Debug/Warn with context that hash generation failed, then return null so behavior is unchanged but now diagnostic information is recorded.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs`:
- Around line 53-56: The empty catch in ImageHashGenerator (the try/catch that
currently just "return null") swallows all exceptions—change it to catch
Exception (e) and log the exception before returning null so failures are
observable; specifically, inside ImageHashGenerator's hash-generation method log
the caught exception (using the same logging facility used by
ImageLoader.LoadAsync, e.g., Log.Exception or the project logger) at Debug/Warn
with context that hash generation failed, then return null so behavior is
unchanged but now diagnostic information is recorded.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: f9b5d6e5-978c-4370-939e-a4746631d655
📒 Files selected for processing (1)
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs
There was a problem hiding this comment.
1 issue found across 1 file (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs">
<violation number="1" location="Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs:46">
P1: Image hash generation no longer normalizes to fixed dimensions, making hashes source-size dependent and increasing allocation with input size.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
91cd482 to
43da838
Compare
Jack251970
left a comment
There was a problem hiding this comment.
Please resolve my comments. And here is a edited version for your reference.
using System;
using System.Buffers;
using System.Security.Cryptography;
using System.Windows.Media;
using System.Windows.Media.Imaging;
namespace Flow.Launcher.Infrastructure.Image
{
public interface IImageHashGenerator
{
string GetHashFromImage(ImageSource image);
}
public class ImageHashGenerator : IImageHashGenerator
{
public string GetHashFromImage(ImageSource imageSource)
{
if (imageSource is not BitmapSource { IsFrozen: true } image)
{
return null;
}
try
{
// Normalize to a deterministic pixel format (32-bit premultiplied BGRA)
BitmapSource normalized = image;
if (normalized.Format != PixelFormats.Pbgra32)
{
var converted = new FormatConvertedBitmap();
converted.BeginInit();
converted.Source = normalized;
converted.DestinationFormat = PixelFormats.Pbgra32;
converted.EndInit();
converted.Freeze();
normalized = converted;
}
// Since we forced Pbgra32, we know it is exactly 4 bytes per pixel.
int bytesPerPixel = 4;
int stride = normalized.PixelWidth * bytesPerPixel;
int bufferSize = stride * normalized.PixelHeight;
if (bufferSize <= 0)
return null;
// Use ArrayPool to prevent Large Object Heap (LOH) allocations for big images
byte[] pixels = ArrayPool<byte>.Shared.Rent(bufferSize);
try
{
normalized.CopyPixels(pixels, stride, 0);
// HashData accepts a Span, so we slice the rented array to the exact buffer size
// (Rented arrays are often larger than the requested size)
var hashBytes = SHA1.HashData(new ReadOnlySpan<byte>(pixels, 0, bufferSize));
return Convert.ToBase64String(hashBytes);
}
finally
{
// Always return the array to the pool
ArrayPool<byte>.Shared.Return(pixels);
}
}
catch
{
return null;
}
}
}
}
| if (bpp < 1) | ||
| return null; | ||
|
|
||
| var bytesPerPixel = (bpp + 7) / 8; |
There was a problem hiding this comment.
Because you explicitly force the image format to PixelFormats.Pbgra32 earlier in the method, you already know with 100% certainty that the BitsPerPixel is 32.
You can replace the dynamic math ((bpp + 7) / 8) with a hardcoded 4 bytes per pixel.
|
|
||
| var bytesPerPixel = (bpp + 7) / 8; | ||
| var stride = normalized.PixelWidth * bytesPerPixel; | ||
| var pixels = new byte[stride * normalized.PixelHeight]; |
There was a problem hiding this comment.
If a user passes in a 4K image, this will allocate a ~33 MB byte array on the Large Object Heap. Doing this frequently (e.g., in a launcher app parsing many icons/images) will cause significant Garbage Collection (GC) pressure and micro-stutters.
Fix: Use ArrayPool.Shared.Rent() to reuse memory buffers instead of allocating new ones every time.
064a9ca to
e436287
Compare
Closes #4116
Summary by cubic
Makes image hashing deterministic and format/DPI-agnostic by normalizing to
Pbgra32and hashing raw pixels. Prevents collisions across formats/metadata and reduces allocations.Pbgra32pixels viaSHA1.HashDatainstead of JPEG-encoded buffers.BitmapSourceinputs withIsFrozen; others return null.Pbgra32, freeze the converted bitmap, guard zero-size images, and useArrayPool<byte>for pixel storage.MemoryStream,JpegBitmapEncoder,BitmapFrame, and any reliance on encoder output or metadata (including DPI/EXIF).ArrayPool<byte>, returns it after use to avoid LOH allocations and reduce GC pressure.Written for commit e436287. Summary will update on new commits. Review in cubic