
Commit b3b9bd1

Merge pull request #18 from dev-diaries41/staging
Merge v1.1.1 changes to main
2 parents 5f3fd0f + ba22110 commit b3b9bd1

36 files changed, with 31448 additions and 442 deletions

CHANGELOG.md

Lines changed: 13 additions & 0 deletions
@@ -1,3 +1,16 @@
1+
### v1.1.1 - 02/11/2025
2+
3+
### Added
4+
* Added new text embedding provider, Mini-LM
5+
6+
### Changed
7+
* `IEmbeddingProvider` is now required to provide the `embeddingDim` property (previously optional)
8+
* Renamed `embeddingLength` to `embeddingDimension` for the `FileEmbeddingStore` constructor param
9+
* Moved interfaces:
10+
- Moved to core/embeddings: `IEmbeddingStore`, `IRetriever`, `IEmbeddingProvider`
11+
- Moved to core/processors: `IProcessorListener`
12+
- Moved to ml/models: `IModelLoader`
13+
114
### v1.1.0 - 30/10/2025
215

316
### Changed

README.md

Lines changed: 124 additions & 5 deletions
@@ -3,6 +3,7 @@
33
## Table of Contents
44

55
* [Overview](#overview)
6+
* [Quick Start](#quick-start)
67
* [Documentation](docs/README.md)
78
* [Key Structure](#key-structure)
89
* [Installation](#installation)
@@ -11,7 +12,7 @@
1112
+ [2. Install ML Module (Optional)](#2-install-ml-module-optional)
1213
* [Design Choices](#design-choices)
1314

14-
+ [Core and ML](#core-and-ml)``
15+
+ [Core and ML](#core-and-ml)
1516
+ [Constraints](#constraints)
1617
+ [Model](#model)
1718
+ [Embedding Storage](#embedding-storage)
@@ -35,7 +36,125 @@ SmartScanSdk is a modular Android SDK that powers the **SmartScan app**. It prov
3536
3637
---
3738

38-
## **Key Structure**
39+
## Quick Start
40+
41+
The sections below show how to get started with embedding, indexing, and searching.
42+
43+
### Embeddings
44+
45+
#### Text Embeddings
46+
47+
Generate vector embeddings from text strings or batches of text for tasks such as semantic search or similarity comparison.
48+
49+
**Usage Example:**
50+
51+
```kotlin
// import com.fpf.smartscansdk.ml.models.providers.embeddings.clip.ClipTextEmbedder

// Requires the model to be in raw resources, e.g. res/raw/text_encoder_quant_int8.onnx
val textEmbedder = ClipTextEmbedder(context, ResourceId(R.raw.text_encoder_quant_int8))
val text = "Hello smartscan"
val embedding = textEmbedder.embed(text) // embed is a suspend function; call it from a coroutine
```
60+
61+
**Batch Example:**
62+
63+
```kotlin
val texts = listOf("first sentence", "second sentence")
val embeddings = textEmbedder.embedBatch(texts)
```
67+
68+
---
69+
70+
#### Image Embeddings
71+
72+
Generate vector embeddings from images (as `Bitmap`) for visual search or similarity tasks.
73+
74+
**Usage Example:**
75+
76+
```kotlin
// import com.fpf.smartscansdk.ml.models.providers.embeddings.clip.ClipImageEmbedder

// Requires the model to be in raw resources, e.g. res/raw/image_encoder_quant_int8.onnx
val imageEmbedder = ClipImageEmbedder(context, ResourceId(R.raw.image_encoder_quant_int8))

val embedding = imageEmbedder.embed(bitmap) // bitmap: the input image as a Bitmap
```
85+
86+
87+
**Batch Example:**
88+
89+
```kotlin
val images: List<Bitmap> = loadBitmaps() // placeholder function returning the images to embed
val embeddings = imageEmbedder.embedBatch(images)
```
93+
94+
### Indexing
95+
96+
To get started with indexing media quickly, use the provided `ImageIndexer` and `VideoIndexer` classes as shown below. You can also create your own indexers (including for text-related data) by implementing the `BatchProcessor` interface. See the docs for more details.
97+
98+
#### Image Indexing
99+
100+
Index images to enable similarity search. The index is saved as a binary file and managed with a FileEmbeddingStore.
101+
> **Important**: During indexing, the MediaStore id is used as the `id` of each stored `Embedding`. This id can later be used for retrieval.
102+
103+
104+
```kotlin
val imageEmbedder = ClipImageEmbedder(context, ResourceId(R.raw.image_encoder_quant_int8))
val imageStore = FileEmbeddingStore(File(context.filesDir, "image_index.bin"), imageEmbedder.embeddingDim, useCache = false) // cache not needed for indexing
val imageIndexer = ImageIndexer(imageEmbedder, context = context, listener = null, store = imageStore) // optionally pass a listener to handle events
val ids = getImageIds() // placeholder function to get MediaStore image ids
imageIndexer.run(ids)
```
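The `FileEmbeddingStore` hunks in this commit size each record as `8 + 8 + embeddingDimension * 4` bytes in little-endian order: an id, a date, then the floats. Below is a minimal sketch of that per-entry layout only. It assumes nothing about the file header beyond what the diff shows. `entryBytes`, `encodeEntry`, and `decodeEntry` are hypothetical helper names, and `Embedding` is redeclared locally so the sketch compiles on its own.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Local stand-in mirroring the fields the diff reads: id, date, embeddings.
data class Embedding(val id: Long, val date: Long, val embeddings: FloatArray)

// Bytes per entry: id (8) + date (8) + one 4-byte float per dimension.
fun entryBytes(dim: Int): Int = 8 + 8 + dim * 4

// Hypothetical helper: serialize one entry in little-endian order.
fun encodeEntry(e: Embedding, dim: Int): ByteBuffer {
    require(e.embeddings.size == dim) { "Embedding dimension must be $dim" }
    val buf = ByteBuffer.allocate(entryBytes(dim)).order(ByteOrder.LITTLE_ENDIAN)
    buf.putLong(e.id)
    buf.putLong(e.date)
    for (v in e.embeddings) buf.putFloat(v)
    buf.flip() // switch the buffer from writing to reading
    return buf
}

// Hypothetical helper: read one entry back from the buffer's current position.
fun decodeEntry(buf: ByteBuffer, dim: Int): Embedding {
    buf.order(ByteOrder.LITTLE_ENDIAN)
    val id = buf.long
    val date = buf.long
    val floats = FloatArray(dim) { buf.float }
    return Embedding(id, date, floats)
}

fun main() {
    val original = Embedding(42L, 1_700_000_000_000L, floatArrayOf(0.5f, -1f, 2f))
    val decoded = decodeEntry(encodeEntry(original, dim = 3), dim = 3)
    println(entryBytes(512)) // 16 bytes of id/date plus 512 floats
    println(decoded.id == original.id && decoded.embeddings.contentEquals(original.embeddings))
}
```

Because every entry has the same size, the dimension passed to the store must match the embedder's `embeddingDim`, which is why the examples above pass `imageEmbedder.embeddingDim` to the constructor.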
111+
112+
#### Video Indexing
113+
114+
Index videos to enable similarity search. The index is saved as a binary file and managed with a FileEmbeddingStore.
115+
116+
```kotlin
val imageEmbedder = ClipImageEmbedder(context, ResourceId(R.raw.image_encoder_quant_int8))
val videoStore = FileEmbeddingStore(File(context.filesDir, "video_index.bin"), imageEmbedder.embeddingDim, useCache = false)
val videoIndexer = VideoIndexer(imageEmbedder, context = context, listener = null, store = videoStore, width = ClipConfig.IMAGE_SIZE_X, height = ClipConfig.IMAGE_SIZE_Y)
val ids = getVideoIds() // placeholder function to get MediaStore video ids
videoIndexer.run(ids)
```
123+
124+
### Searching
125+
126+
The examples below show how to search with either a text query or an image. The results are returned as a `List<Embedding>`. The `id` of each result corresponds to a MediaStore id, which you can use to retrieve the matching images.
127+
128+
#### Text-to-Image Search
129+
130+
```kotlin
val textEmbedder = ClipTextEmbedder(context, ResourceId(R.raw.text_encoder_quant_int8))
// CLIP text and image embeddings share one dimension, so the text embedder's embeddingDim matches the image index
val imageStore = FileEmbeddingStore(File(context.filesDir, "image_index.bin"), textEmbedder.embeddingDim, useCache = false)
val imageRetriever = FileEmbeddingRetriever(imageStore)
val query = "my search query"
val embedding = textEmbedder.embed(query)
val topK = 20
val similarityThreshold = 0.2f
val results = imageRetriever.query(embedding, topK, similarityThreshold)
```
141+
142+
#### Reverse Image Search
143+
144+
```kotlin
// The embedder is created first because the store needs its embeddingDim
val imageEmbedder = ClipImageEmbedder(context, ResourceId(R.raw.image_encoder_quant_int8))
val imageStore = FileEmbeddingStore(File(context.filesDir, "image_index.bin"), imageEmbedder.embeddingDim, useCache = false)
val imageRetriever = FileEmbeddingRetriever(imageStore)
val embedding = imageEmbedder.embed(bitmap) // bitmap: the query image as a Bitmap
val topK = 20
val similarityThreshold = 0.2f
val results = imageRetriever.query(embedding, topK, similarityThreshold)
```
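To make the roles of `topK` and `similarityThreshold` concrete, here is a minimal sketch of a brute-force similarity query. It assumes cosine similarity as the score, which this commit does not confirm, and it is not the SDK's `FileEmbeddingRetriever` implementation. `cosineSimilarity` and `topKBySimilarity` are hypothetical names, and `Embedding` is redeclared locally so the sketch compiles on its own.

```kotlin
import kotlin.math.sqrt

// Local stand-in mirroring the SDK's Embedding fields (id, date, embeddings).
data class Embedding(val id: Long, val date: Long, val embeddings: FloatArray)

// Cosine similarity between two vectors of equal length; returns 0 for a zero vector.
fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
    require(a.size == b.size) { "Vectors must have the same dimension" }
    var dot = 0f
    var normA = 0f
    var normB = 0f
    for (i in a.indices) {
        dot += a[i] * b[i]
        normA += a[i] * a[i]
        normB += b[i] * b[i]
    }
    val denom = sqrt(normA) * sqrt(normB)
    return if (denom == 0f) 0f else dot / denom
}

// Hypothetical helper: score every candidate, drop scores below the threshold,
// then return the topK highest-scoring embeddings in descending order.
fun topKBySimilarity(
    query: FloatArray,
    candidates: List<Embedding>,
    topK: Int,
    threshold: Float
): List<Embedding> =
    candidates
        .map { it to cosineSimilarity(query, it.embeddings) }
        .filter { (_, score) -> score >= threshold }
        .sortedByDescending { (_, score) -> score }
        .take(topK)
        .map { (embedding, _) -> embedding }

fun main() {
    val stored = listOf(
        Embedding(1L, 0L, floatArrayOf(1f, 0f)),
        Embedding(2L, 0L, floatArrayOf(0f, 1f)),
        Embedding(3L, 0L, floatArrayOf(1f, 1f))
    )
    val results = topKBySimilarity(floatArrayOf(1f, 0f), stored, topK = 2, threshold = 0.2f)
    println(results.map { it.id }) // prints [1, 3]
}
```

In this toy data the orthogonal vector (id 2) scores 0 and is removed by the threshold before `topK` is applied. Raising the threshold trades recall for precision in the same way.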
154+
155+
---
156+
157+
## Key Structure
39158

40159
```
41160
SmartScanSdk/
@@ -66,7 +185,7 @@ SmartScanSdk/
66185

67186
---
68187

69-
## **Installation**
188+
## Installation
70189

71190
### **1. Install Core Module**
72191

@@ -84,7 +203,7 @@ implementation("com.github.dev-diaries41.smartscan-sdk:smartscan-ml:1.1.0")
84203
85204
---
86205

87-
## **Design Choices**
206+
## Design Choices
88207

89208
### Core and ML
90209

@@ -140,7 +259,7 @@ File-based memory-mapped loading is significantly faster and scales better.
140259

141260
___
142261

143-
## **Gradle / Kotlin Setup Notes**
262+
## Gradle / Kotlin Setup Notes
144263

145264
* Java 17 / Kotlin JVM 17
146265
* `compileSdk = 36`, `targetSdk = 34`, `minSdk = 30`
Lines changed: 0 additions & 30 deletions
@@ -1,8 +1,6 @@
11
package com.fpf.smartscansdk.core.data
22

33

4-
import android.graphics.Bitmap
5-
64
// `Embedding` represents a raw vector for a single media item, with `id` corresponding to its `MediaStoreId`.
75
data class Embedding(
86
val id: Long,
@@ -18,33 +16,5 @@ data class PrototypeEmbedding(
1816
)
1917

2018

21-
interface IEmbeddingStore {
22-
val isCached: Boolean
23-
val exists: Boolean
24-
suspend fun add(newEmbeddings: List<Embedding>)
25-
suspend fun remove(ids: List<Long>)
26-
suspend fun get(): List<Embedding>
27-
fun clear()
28-
}
29-
30-
interface IRetriever {
31-
suspend fun query(
32-
embedding: FloatArray,
33-
topK: Int,
34-
threshold: Float
35-
): List<Embedding>
36-
}
37-
38-
39-
interface IEmbeddingProvider<T> {
40-
val embeddingDim: Int? get() = null
41-
fun closeSession() = Unit
42-
suspend fun embed(data: T): FloatArray
43-
suspend fun embedBatch(data: List<T>): List<FloatArray>
44-
}
45-
46-
47-
typealias TextEmbeddingProvider = IEmbeddingProvider<String>
48-
typealias ImageEmbeddingProvider = IEmbeddingProvider<Bitmap>
4919

5020

core/src/main/java/com/fpf/smartscansdk/core/data/Processors.kt

Lines changed: 0 additions & 10 deletions
@@ -1,15 +1,5 @@
11
package com.fpf.smartscansdk.core.data
22

3-
import android.content.Context
4-
5-
interface IProcessorListener<Input, Output> {
6-
suspend fun onActive(context: Context) = Unit
7-
suspend fun onBatchComplete(context: Context, batch: List<Output>) = Unit
8-
suspend fun onComplete(context: Context, metrics: Metrics.Success) = Unit
9-
suspend fun onProgress(context: Context, progress: Float) = Unit
10-
fun onError(context: Context, error: Exception, item: Input) = Unit
11-
suspend fun onFail(context: Context, failureMetrics: Metrics.Failure) = Unit
12-
}
133

144
sealed class Metrics {
155
data class Success(val totalProcessed: Int = 0, val timeElapsed: Long = 0L) : Metrics()

core/src/main/java/com/fpf/smartscansdk/core/embeddings/FileEmbeddingRetriever.kt

Lines changed: 0 additions & 1 deletion
@@ -1,7 +1,6 @@
11
package com.fpf.smartscansdk.core.embeddings
22

33
import com.fpf.smartscansdk.core.data.Embedding
4-
import com.fpf.smartscansdk.core.data.IRetriever
54

65
class FileEmbeddingRetriever(
76
private val store: FileEmbeddingStore

core/src/main/java/com/fpf/smartscansdk/core/embeddings/FileEmbeddingStore.kt

Lines changed: 10 additions & 11 deletions
@@ -2,7 +2,6 @@ package com.fpf.smartscansdk.core.embeddings
22

33
import android.util.Log
44
import com.fpf.smartscansdk.core.data.Embedding
5-
import com.fpf.smartscansdk.core.data.IEmbeddingStore
65
import kotlinx.coroutines.Dispatchers
76
import kotlinx.coroutines.withContext
87
import java.io.File
@@ -16,7 +15,7 @@ import java.nio.channels.FileChannel
1615

1716
class FileEmbeddingStore(
1817
private val file: File,
19-
private val embeddingLength: Int,
18+
private val embeddingDimension: Int,
2019
val useCache: Boolean = true,
2120
):
2221
IEmbeddingStore {
@@ -54,12 +53,12 @@ class FileEmbeddingStore(
5453
val batch = embeddingsList.subList(index, end)
5554

5655
// Allocate a smaller buffer for this batch
57-
val batchBuffer = ByteBuffer.allocate(batch.size * (8 + 8 + embeddingLength * 4))
56+
val batchBuffer = ByteBuffer.allocate(batch.size * (8 + 8 + embeddingDimension * 4))
5857
.order(ByteOrder.LITTLE_ENDIAN)
5958

6059
for (embedding in batch) {
61-
if (embedding.embeddings.size != embeddingLength) {
62-
throw IllegalArgumentException("Embedding length must be $embeddingLength")
60+
if (embedding.embeddings.size != embeddingDimension) {
61+
throw IllegalArgumentException("Embedding dimension must be $embeddingDimension")
6362
}
6463
batchBuffer.putLong(embedding.id)
6564
batchBuffer.putLong(embedding.date)
@@ -89,10 +88,10 @@ class FileEmbeddingStore(
8988
repeat(count) {
9089
val id = buffer.long
9190
val date = buffer.long
92-
val floats = FloatArray(embeddingLength)
91+
val floats = FloatArray(embeddingDimension)
9392
val fb = buffer.asFloatBuffer()
9493
fb.get(floats)
95-
buffer.position(buffer.position() + embeddingLength * 4)
94+
buffer.position(buffer.position() + embeddingDimension * 4)
9695
map[id] = Embedding(id, date, floats)
9796
}
9897
if (useCache) cache = map
@@ -122,7 +121,7 @@ class FileEmbeddingStore(
122121
val existingCount = headerBuf.int
123122

124123
// Basic validation: each existing entry is at least id(8)+date(8)+EMBEDDING_LEN*4
125-
val minEntryBytes = 8 + 8 + embeddingLength * 4
124+
val minEntryBytes = 8 + 8 + embeddingDimension * 4
126125
val maxCountFromSize = (channel.size() / minEntryBytes).toInt()
127126
if (existingCount < 0 || existingCount > maxCountFromSize + 10_000) {
128127
throw IOException("Corrupt embeddings header: count=$existingCount, fileSize=${channel.size()}")
@@ -140,10 +139,10 @@ class FileEmbeddingStore(
140139
channel.position(channel.size())
141140

142141
for (embedding in newEmbeddings) {
143-
if (embedding.embeddings.size != embeddingLength) {
144-
throw IllegalArgumentException("Embedding length must be $embeddingLength")
142+
if (embedding.embeddings.size != embeddingDimension) {
143+
throw IllegalArgumentException("Embedding dimension must be $embeddingDimension")
145144
}
146-
val entryBytes = (8 + 8) + embeddingLength * 4
145+
val entryBytes = (8 + 8) + embeddingDimension * 4
147146
val buf = ByteBuffer.allocate(entryBytes).order(ByteOrder.LITTLE_ENDIAN)
148147
buf.putLong(embedding.id)
149148
buf.putLong(embedding.date)
Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
1+
package com.fpf.smartscansdk.core.embeddings
2+
3+
import android.graphics.Bitmap
4+
5+
6+
interface IEmbeddingProvider<T> {
7+
val embeddingDim: Int
8+
fun closeSession() = Unit
9+
suspend fun embed(data: T): FloatArray
10+
suspend fun embedBatch(data: List<T>): List<FloatArray>
11+
}
12+
13+
14+
typealias TextEmbeddingProvider = IEmbeddingProvider<String>
15+
typealias ImageEmbeddingProvider = IEmbeddingProvider<Bitmap>
16+
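Since `embeddingDim` no longer has a default, every custom provider must now declare it. The following is a minimal sketch of what that looks like for an implementer. The interface is copied from the hunk above so the example compiles on its own. `ToyTextEmbedder` and `bucketCounts` are hypothetical, and the character-bucket "embedding" is a stand-in for a real model, not something the SDK provides.

```kotlin
// Local copy of the interface as it appears in this commit.
interface IEmbeddingProvider<T> {
    val embeddingDim: Int
    fun closeSession() = Unit
    suspend fun embed(data: T): FloatArray
    suspend fun embedBatch(data: List<T>): List<FloatArray>
}

// Hypothetical toy provider: counts characters into a fixed number of buckets.
// embeddingDim must be overridden because the interface no longer supplies a default.
class ToyTextEmbedder(override val embeddingDim: Int = 8) : IEmbeddingProvider<String> {
    override suspend fun embed(data: String): FloatArray = bucketCounts(data)

    override suspend fun embedBatch(data: List<String>): List<FloatArray> =
        data.map { embed(it) }

    // Plain (non-suspend) helper so the vector logic can be exercised without a coroutine.
    fun bucketCounts(text: String): FloatArray {
        val vector = FloatArray(embeddingDim)
        for (ch in text) vector[ch.code % embeddingDim] += 1f
        return vector
    }
}

suspend fun main() {
    val embedder = ToyTextEmbedder()
    val vectors = embedder.embedBatch(listOf("first sentence", "second sentence"))
    // Every vector has exactly embeddingDim entries, which is what a fixed-size store relies on.
    println(vectors.all { it.size == embedder.embeddingDim })
}
```

A non-nullable `embeddingDim` lets callers size a store directly from the provider, as the README examples do with `imageEmbedder.embeddingDim`, without a null check.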
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
1+
package com.fpf.smartscansdk.core.embeddings
2+
3+
import com.fpf.smartscansdk.core.data.Embedding
4+
5+
interface IEmbeddingStore {
6+
val isCached: Boolean
7+
val exists: Boolean
8+
suspend fun add(newEmbeddings: List<Embedding>)
9+
suspend fun remove(ids: List<Long>)
10+
suspend fun get(): List<Embedding>
11+
fun clear()
12+
}
Lines changed: 12 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,12 @@
1+
package com.fpf.smartscansdk.core.embeddings
2+
3+
import com.fpf.smartscansdk.core.data.Embedding
4+
5+
interface IRetriever {
6+
suspend fun query(
7+
embedding: FloatArray,
8+
topK: Int,
9+
threshold: Float
10+
): List<Embedding>
11+
}
12+

core/src/main/java/com/fpf/smartscansdk/core/indexers/ImageIndexer.kt

Lines changed: 3 additions & 3 deletions
@@ -4,12 +4,12 @@ import android.content.ContentUris
44
import android.content.Context
55
import android.provider.MediaStore
66
import com.fpf.smartscansdk.core.data.Embedding
7-
import com.fpf.smartscansdk.core.data.IEmbeddingStore
8-
import com.fpf.smartscansdk.core.data.IProcessorListener
9-
import com.fpf.smartscansdk.core.data.ImageEmbeddingProvider
107
import com.fpf.smartscansdk.core.data.ProcessOptions
8+
import com.fpf.smartscansdk.core.embeddings.IEmbeddingStore
9+
import com.fpf.smartscansdk.core.embeddings.ImageEmbeddingProvider
1110
import com.fpf.smartscansdk.core.media.getBitmapFromUri
1211
import com.fpf.smartscansdk.core.processors.BatchProcessor
12+
import com.fpf.smartscansdk.core.processors.IProcessorListener
1313
import kotlinx.coroutines.NonCancellable
1414
import kotlinx.coroutines.withContext
1515
