Skip to content

Make image hashing deterministic and format/DPI-agnostic (#4116)#4432

Open
benello wants to merge 3 commits intoFlow-Launcher:devfrom
benello:Thumbnail-hash-collision-fix-(#4116)
Open

Make image hashing deterministic and format/DPI-agnostic (#4116)#4432
benello wants to merge 3 commits intoFlow-Launcher:devfrom
benello:Thumbnail-hash-collision-fix-(#4116)

Conversation

@benello
Copy link
Copy Markdown

@benello benello commented Apr 26, 2026

Closes #4116


Summary by cubic

Makes image hashing deterministic and format/DPI-agnostic by normalizing to Pbgra32 and hashing raw pixels. Prevents collisions across formats/metadata and reduces allocations.

  • Summary of changes
    • Changed: Hash raw Pbgra32 pixels via SHA1.HashData instead of JPEG-encoded buffers.
    • Changed: Only accept BitmapSource inputs with IsFrozen; others return null.
    • Added: Normalize to Pbgra32, freeze the converted bitmap, guard zero-size images, and use ArrayPool<byte> for pixel storage.
    • Removed: MemoryStream, JpegBitmapEncoder, BitmapFrame, and any reliance on encoder output or metadata (including DPI/EXIF).
    • Memory: Rents a width×height×4 buffer from ArrayPool<byte>, returns it after use to avoid LOH allocations and reduce GC pressure.
    • Security: No new risks; SHA‑1 used only for IDs, pooled buffers are returned and not exposed.
    • Tests: No unit tests added.

Written for commit e436287. Summary will update on new commits. Review in cubic

@github-actions github-actions Bot added this to the 2.2.0 milestone Apr 26, 2026
@coderabbitai coderabbitai Bot added the enhancement New feature or request label Apr 26, 2026
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 1 file

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 26, 2026

📝 Walkthrough

Walkthrough

Replaces encoder-based hashing with deterministic raw-pixel hashing. GetHashFromImage now only accepts frozen BitmapSource, normalizes to PixelFormats.Pbgra32 when needed, copies pixels with CopyPixels, computes SHA-1 over the pixel buffer via SHA1.HashData, and returns a Base64 digest. Encoder/MemoryStream path removed.

Changes

Cohort / File(s) Summary
Image hash implementation
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs
Removed JPEG encoder + MemoryStream hashing. GetHashFromImage now returns null for non-frozen/non-BitmapSource inputs; converts images to PixelFormats.Pbgra32 when required, rents an ArrayPool<byte> buffer, copies raw pixels with CopyPixels, computes SHA-1 via SHA1.HashData, and returns Base64. Removed previous System.IO/encoder and SHA1.Create().ComputeHash usage.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • Fix Wallpaper File Lock #3986 — Changes around ImageLoader/ImageHashGenerator usage and handling of frozen BitmapSource; directly related to hashing/freeze behavior.

Suggested reviewers

  • jjw24
  • onesounds
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately summarizes the main change: making image hashing deterministic and format/DPI-agnostic, with issue reference.
Linked Issues check ✅ Passed The changes fully address issue #4116 by normalizing to Pbgra32, hashing raw pixels with SHA1.HashData, freezing bitmaps, guarding zero-size images, using ArrayPool, and removing encoder/metadata dependencies.
Out of Scope Changes check ✅ Passed All changes in ImageHashGenerator.cs are directly scoped to resolving hash collisions through deterministic raw pixel hashing; no unrelated modifications detected.
Description check ✅ Passed The pull request description is directly related to the changeset and clearly explains the objectives and implementation of the image hashing determinism fix.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs (2)

26-38: Redundant FormatConvertedBitmap step — RenderTargetBitmap already normalizes the format.

The destination RenderTargetBitmap on line 41 is created with PixelFormats.Pbgra32, and DrawingContext.DrawImage will render any BitmapSource into that target regardless of the source format. The explicit FormatConvertedBitmap conversion is unnecessary work (and an extra allocation) per call.

♻️ Proposed simplification
             try
             {
-                // Normalize to a deterministic pixel format (32-bit premultiplied BGRA)
-                BitmapSource normalized = image;
-                if (normalized.Format != PixelFormats.Pbgra32)
-                {
-                    var converted = new FormatConvertedBitmap();
-                    converted.BeginInit();
-                    converted.Source = normalized;
-                    converted.DestinationFormat = PixelFormats.Pbgra32;
-                    converted.EndInit();
-                    converted.Freeze();
-
-                    normalized = converted;
-                }
-
-                // Draw into a fixed-size RenderTarget at 96 DPI to ensure deterministic pixel buffer
+                // Draw into a fixed-size Pbgra32 RenderTarget at 96 DPI to ensure deterministic pixel buffer
                 var rtb = new RenderTargetBitmap(HashSize, HashSize, 96, 96, PixelFormats.Pbgra32);

(adjust subsequent references from normalized to image)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs` around lines 26 -
38, The FormatConvertedBitmap conversion block (creating `converted`, calling
BeginInit/EndInit/Freeze and assigning to `normalized`) is redundant because the
subsequent `RenderTargetBitmap` is already created with `PixelFormats.Pbgra32`
and `DrawingContext.DrawImage` will rasterize any `BitmapSource` into that
format; remove the entire conversion and the `normalized` variable and use the
original `image` directly where `normalized` is referenced (keep
`RenderTargetBitmap` creation and `DrawingContext.DrawImage(image, ...)` as-is)
to avoid the extra allocation and work.

60-69: Dead defensive checks — output format is fixed.

Because rtb is constructed on line 41 with PixelFormats.Pbgra32, bpp is always 32 and bufferSize is always HashSize * HashSize * 4. The bpp < 1 and bufferSize < 1 guards can never trigger and just add noise. Also, since you now control the format/dimensions, bytesPerPixel/stride can be hard-coded constants, making intent clearer.

♻️ Proposed simplification
-                // Extract raw pixel bytes
-                var bpp = rtb.Format.BitsPerPixel;
-                if (bpp < 1)
-                    return null;
-
-                var bytesPerPixel = (bpp + 7) / 8;
-                var stride = rtb.PixelWidth * bytesPerPixel;
-
-                var bufferSize = stride * rtb.PixelHeight;
-                if (bufferSize < 1)
-                    return null;
-
-                var pixels = new byte[bufferSize];
+                // Pbgra32 => 4 bytes per pixel, no row padding for HashSize-aligned widths.
+                const int BytesPerPixel = 4;
+                var stride = HashSize * BytesPerPixel;
+                var pixels = new byte[stride * HashSize];
                 rtb.CopyPixels(pixels, stride, 0);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs` around lines 60 -
69, The checks for bpp < 1 and bufferSize < 1 are dead because rtb is
constructed with PixelFormats.Pbgra32 (so bpp == 32) and dimensions are fixed;
remove the redundant guards and simplify by hard-coding bytesPerPixel = 4 and
computing stride as HashSize * bytesPerPixel (or rtb.PixelWidth * 4) and
bufferSize as stride * HashSize, replacing the computed
bpp/bytesPerPixel/stride/bufferSize logic in ImageHashGenerator (references:
rtb, PixelFormats.Pbgra32, bpp, bytesPerPixel, stride, bufferSize, HashSize) to
make the intent explicit and eliminate unreachable conditionals.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs`:
- Around line 41-57: GetHashFromImage currently calls
RenderTargetBitmap.Render(dv) from thread-pool threads (invoked via
ImageLoader.LoadAsync / InitializeAsync), which is not Dispatcher-safe; marshal
the rendering (and subsequent Freeze()) to the UI dispatcher: wrap
rtb.Render(dv) and rtb.Freeze() in Application.Current?.Dispatcher?.Invoke (or
BeginInvoke and wait) so the RenderTargetBitmap operations run on the Dispatcher
thread, and handle the case Application.Current is null by falling back to an
alternative path (e.g., decode/CopyPixels) or throwing a clear exception; update
references in GetHashFromImage to perform the Dispatcher invoke around the
RenderTargetBitmap.Render and Freeze calls.

---

Nitpick comments:
In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs`:
- Around line 26-38: The FormatConvertedBitmap conversion block (creating
`converted`, calling BeginInit/EndInit/Freeze and assigning to `normalized`) is
redundant because the subsequent `RenderTargetBitmap` is already created with
`PixelFormats.Pbgra32` and `DrawingContext.DrawImage` will rasterize any
`BitmapSource` into that format; remove the entire conversion and the
`normalized` variable and use the original `image` directly where `normalized`
is referenced (keep `RenderTargetBitmap` creation and
`DrawingContext.DrawImage(image, ...)` as-is) to avoid the extra allocation and
work.
- Around line 60-69: The checks for bpp < 1 and bufferSize < 1 are dead because
rtb is constructed with PixelFormats.Pbgra32 (so bpp == 32) and dimensions are
fixed; remove the redundant guards and simplify by hard-coding bytesPerPixel = 4
and computing stride as HashSize * bytesPerPixel (or rtb.PixelWidth * 4) and
bufferSize as stride * HashSize, replacing the computed
bpp/bytesPerPixel/stride/bufferSize logic in ImageHashGenerator (references:
rtb, PixelFormats.Pbgra32, bpp, bytesPerPixel, stride, bufferSize, HashSize) to
make the intent explicit and eliminate unreachable conditionals.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b0fd1b1a-38be-4581-8439-66f5a464499e

📥 Commits

Reviewing files that changed from the base of the PR and between 1818da7 and d9e9067.

📒 Files selected for processing (1)
  • Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs

Comment thread Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs (1)

53-56: Silent catch swallows all exceptions — at least log.

Returning null on any failure means the caller (ImageLoader.LoadAsync) silently bypasses hash-based dedup in GuidToKey for that image and may cache it under a fresh key on every load, with no diagnostic trail. Logging at debug/warn (consistent with the rest of ImageLoader, which uses Log.Exception) would make regressions in this hot path observable.

📝 Suggested change
-            catch
-            {
-                return null;
-            }
+            catch (Exception ex)
+            {
+                Log.Exception(nameof(ImageHashGenerator), $"Failed to compute image hash: {ex.Message}", ex);
+                return null;
+            }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs` around lines 53 -
56, The empty catch in ImageHashGenerator (the try/catch that currently just
"return null") swallows all exceptions—change it to catch Exception (e) and log
the exception before returning null so failures are observable; specifically,
inside ImageHashGenerator's hash-generation method log the caught exception
(using the same logging facility used by ImageLoader.LoadAsync, e.g.,
Log.Exception or the project logger) at Debug/Warn with context that hash
generation failed, then return null so behavior is unchanged but now diagnostic
information is recorded.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs`:
- Around line 53-56: The empty catch in ImageHashGenerator (the try/catch that
currently just "return null") swallows all exceptions—change it to catch
Exception (e) and log the exception before returning null so failures are
observable; specifically, inside ImageHashGenerator's hash-generation method log
the caught exception (using the same logging facility used by
ImageLoader.LoadAsync, e.g., Log.Exception or the project logger) at Debug/Warn
with context that hash generation failed, then return null so behavior is
unchanged but now diagnostic information is recorded.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f9b5d6e5-978c-4370-939e-a4746631d655

📥 Commits

Reviewing files that changed from the base of the PR and between d9e9067 and 91cd482.

📒 Files selected for processing (1)
  • Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs

Comment thread Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs">

<violation number="1" location="Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs:46">
P1: Image hash generation no longer normalizes to fixed dimensions, making hashes source-size dependent and increasing allocation with input size.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

Comment thread Flow.Launcher.Infrastructure/Image/ImageHashGenerator.cs
@benello benello force-pushed the Thumbnail-hash-collision-fix-(#4116) branch from 91cd482 to 43da838 Compare April 26, 2026 15:52
Copy link
Copy Markdown
Member

@Jack251970 Jack251970 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please resolve my comments. And here is a edited version for your reference.

using System;
using System.Buffers;
using System.Security.Cryptography;
using System.Windows.Media;
using System.Windows.Media.Imaging;

namespace Flow.Launcher.Infrastructure.Image
{
public interface IImageHashGenerator
{
string GetHashFromImage(ImageSource image);
}

public class ImageHashGenerator : IImageHashGenerator
{
    public string GetHashFromImage(ImageSource imageSource)
    {
        if (imageSource is not BitmapSource { IsFrozen: true } image)
        {
            return null;
        }

        try
        {
            // Normalize to a deterministic pixel format (32-bit premultiplied BGRA)
            BitmapSource normalized = image;
            if (normalized.Format != PixelFormats.Pbgra32)
            {
                var converted = new FormatConvertedBitmap();
                converted.BeginInit();
                converted.Source = normalized;
                converted.DestinationFormat = PixelFormats.Pbgra32;
                converted.EndInit();
                converted.Freeze();
                normalized = converted;
            }

            // Since we forced Pbgra32, we know it is exactly 4 bytes per pixel.
            int bytesPerPixel = 4;
            int stride = normalized.PixelWidth * bytesPerPixel;
            int bufferSize = stride * normalized.PixelHeight;

            if (bufferSize <= 0)
                return null;

            // Use ArrayPool to prevent Large Object Heap (LOH) allocations for big images
            byte[] pixels = ArrayPool<byte>.Shared.Rent(bufferSize);
            
            try
            {
                normalized.CopyPixels(pixels, stride, 0);

                // HashData accepts a Span, so we slice the rented array to the exact buffer size
                // (Rented arrays are often larger than the requested size)
                var hashBytes = SHA1.HashData(new ReadOnlySpan<byte>(pixels, 0, bufferSize));
                return Convert.ToBase64String(hashBytes);
            }
            finally
            {
                // Always return the array to the pool
                ArrayPool<byte>.Shared.Return(pixels);
            }
        }
        catch
        {
            return null;
        }
    }
}

}

if (bpp < 1)
return null;

var bytesPerPixel = (bpp + 7) / 8;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because you explicitly force the image format to PixelFormats.Pbgra32 earlier in the method, you already know with 100% certainty that the BitsPerPixel is 32.
You can replace the dynamic math ((bpp + 7) / 8) with a hardcoded 4 bytes per pixel.


var bytesPerPixel = (bpp + 7) / 8;
var stride = normalized.PixelWidth * bytesPerPixel;
var pixels = new byte[stride * normalized.PixelHeight];
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a user passes in a 4K image, this will allocate a ~33 MB byte array on the Large Object Heap. Doing this frequently (e.g., in a launcher app parsing many icons/images) will cause significant Garbage Collection (GC) pressure and micro-stutters.
Fix: Use ArrayPool.Shared.Rent() to reuse memory buffers instead of allocating new ones every time.

@benello benello force-pushed the Thumbnail-hash-collision-fix-(#4116) branch from 064a9ca to e436287 Compare April 29, 2026 15:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

BUG: Flow.Launcher.Infrastructure.Image.ImageLoader hash collisions

2 participants