---
layout: post
title: "Improving vector search diversity through native MMR"
authors:
- bzhangam
date: 2025-10-24
has_science_table: true
categories:
- technical-posts
meta_keywords: MMR, Maximal Marginal Relevance, search diversity, search ranking, OpenSearch 3.3, vector search
meta_description: Learn how to use Maximal Marginal Relevance (MMR) in OpenSearch to make your search results more diverse.
---

## Improving vector search diversity through native MMR

When it comes to search and recommendation systems, returning highly relevant results is only half the battle. Equally important is diversity: ensuring that users see a range of results rather than multiple near-duplicates. OpenSearch 3.3 makes this easy with native support for Maximal Marginal Relevance (MMR) in k-NN and neural queries.
## What is MMR?

Maximal Marginal Relevance (MMR) is a re-ranking algorithm that balances relevance and diversity:

- **Relevance:** How well a result matches the query.

- **Diversity:** How different the results are from each other.

MMR iteratively selects results that are relevant to the query and not too similar to previously selected results. The trade-off is controlled by a diversity parameter (0 = prioritize relevance, 1 = prioritize diversity).

In vector search, this is particularly useful because embeddings often cluster similar results together. Without MMR, the top-k results might all look nearly identical.
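
To make the selection loop concrete, here is a minimal Python sketch of MMR-style re-ranking. It assumes cosine similarity and candidates that already carry a relevance score and an embedding; the function and names are illustrative only and do not reflect OpenSearch's internal implementation.

```python
import numpy as np

def mmr_select(candidates, k, diversity):
    """Greedy MMR: pick k items balancing relevance and diversity.

    candidates: list of (relevance_score, embedding) pairs.
    diversity:  0.0 = pure relevance, 1.0 = pure diversity.
    Returns the indices of the selected candidates, in pick order.
    """
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    remaining = list(range(len(candidates)))
    selected = []
    while remaining and len(selected) < k:
        best, best_score = None, float("-inf")
        for i in remaining:
            relevance, emb = candidates[i]
            # Penalty: similarity to the closest already-selected result.
            penalty = max(
                (cosine(emb, candidates[j][1]) for j in selected),
                default=0.0,
            )
            score = (1 - diversity) * relevance - diversity * penalty
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

With `diversity = 0` this reduces to sorting by relevance; as it approaches 1, each pick is pushed away from everything already selected. OpenSearch applies the same idea over an oversampled candidate pool before returning the top hits.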
## Native MMR in OpenSearch

Previously, MMR could only be implemented externally, requiring custom pipelines and extra coding. Now, OpenSearch supports native MMR directly in k-NN and neural queries on knn_vector fields. This simplifies your setup and reduces latency.
## How to Use MMR

### Prerequisites

Before using Maximal Marginal Relevance (MMR) for reranking, make sure the required [system-generated search processor factories](https://docs.opensearch.org/latest/search-plugins/search-pipelines/system-generated-search-processors/) are enabled in your cluster:

```json
PUT _cluster/settings
{
  "persistent": {
    "cluster.search.enabled_system_generated_factories": [
      "mmr_over_sample_factory",
      "mmr_rerank_factory"
    ]
  }
}
```

These factories enable OpenSearch to automatically perform the oversampling and reranking steps needed for MMR.
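
If you want to confirm the setting afterward, you can read the cluster settings back; the standard `flat_settings` flag simply returns the keys in flattened form, making the factory list easy to spot:

```json
GET _cluster/settings?flat_settings=true
```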
### Example: Improving Diversity in Neural Search

Suppose we have a neural search index with a semantic field for product descriptions using a dense embedding model. You can set up your index following this [guide](https://docs.opensearch.org/latest/field-types/supported-field-types/semantic/).

#### Index Sample Data

We index a few example product descriptions:

```json
PUT /_bulk
{ "update": { "_index": "my-nlp-index", "_id": "1" } }
{ "doc": {"product_description": "Red apple from USA."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "2" } }
{ "doc": {"product_description": "Red apple from usa."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "3" } }
{ "doc": {"product_description": "Crispy apple."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "4" } }
{ "doc": {"product_description": "Red apple."}, "doc_as_upsert": true }
{ "update": { "_index": "my-nlp-index", "_id": "5" } }
{ "doc": {"product_description": "Orange juice from usa."}, "doc_as_upsert": true }
```
#### Query Without MMR

A standard neural search query for "Red apple" might look like this:

```json
GET /my-nlp-index/_search
{
  "size": 3,
  "_source": { "excludes": ["product_description_semantic_info"] },
  "query": {
    "neural": {
      "product_description": { "query_text": "Red apple" }
    }
  }
}
```

Results:

```json
"hits": [
  { "_id": "4", "_score": 0.956, "_source": {"product_description": "Red apple."} },
  { "_id": "1", "_score": 0.743, "_source": {"product_description": "Red apple from USA."} },
  { "_id": "2", "_score": 0.743, "_source": {"product_description": "Red apple from usa."} }
]
```

Notice how all of the top results are very similar; there's little diversity in what the user sees.
#### Query With MMR

By adding MMR, we can diversify the top results while maintaining relevance:

```json
GET /my-nlp-index/_search
{
  "size": 3,
  "_source": { "excludes": ["product_description_semantic_info"] },
  "query": {
    "neural": {
      "product_description": { "query_text": "Red apple" }
    }
  },
  "ext": {
    "mmr": {
      "candidates": 10,
      "diversity": 0.4
    }
  }
}
```

Results:

```json
"hits": [
  { "_id": "4", "_score": 0.956, "_source": {"product_description": "Red apple."} },
  { "_id": "1", "_score": 0.743, "_source": {"product_description": "Red apple from USA."} },
  { "_id": "3", "_score": 0.611, "_source": {"product_description": "Crispy apple."} }
]
```

By using MMR, we introduce more diverse results (like "Crispy apple") without sacrificing relevance for the top hits. Here, `candidates` controls how many results are oversampled for reranking, and `diversity` sets the relevance/diversity trade-off.
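
The same `ext.mmr` block also applies to plain k-NN queries. Here is a minimal sketch; the index name (`my-knn-index`), vector field (`my_vector`), and query vector are illustrative placeholders rather than part of the sample data above.

```json
GET /my-knn-index/_search
{
  "size": 3,
  "query": {
    "knn": {
      "my_vector": {
        "vector": [0.1, 0.2, 0.3],
        "k": 10
      }
    }
  },
  "ext": {
    "mmr": {
      "candidates": 10,
      "diversity": 0.4
    }
  }
}
```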
## Benchmarking MMR Reranking in OpenSearch

To evaluate the performance impact of Maximal Marginal Relevance (MMR) reranking, we ran benchmark tests on OpenSearch 3.3 across both [vector search](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/vectorsearch/params/corpus/10million/faiss-cohere-768-dp.json) and [neural-search](https://github.com/opensearch-project/opensearch-benchmark-workloads/blob/main/neural_search/params/semanticfield/neural_search_semantic_field_dense_model.json) workloads. These tests help quantify the latency trade-offs introduced by MMR while highlighting the benefits of more diverse search results.

### Cluster Configuration

The following OpenSearch cluster configuration was used:

- Version: OpenSearch 3.3
- Data nodes: 3 × r6g.2xlarge
- Master nodes: 3 × c6g.xlarge
- Benchmark instance: c6g.large

### Vector Search Performance

We used the cohere-1m dataset, which contains one million precomputed embeddings, to evaluate k-nearest neighbor (KNN) queries. The table below summarizes query latency (in milliseconds) for different values of k and MMR candidate sizes; the Δ columns report the overhead that MMR adds over the baseline, in percent and in absolute milliseconds:

| **k** | **Query size** | **MMR candidates** | **KNN (p50 ms)** | **KNN (p90 ms)** | **KNN + MMR (p50 ms)** | **KNN + MMR (p90 ms)** | **p50 Δ (%)** | **p90 Δ (%)** | **p50 Δ (ms)** | **p90 Δ (ms)** |
| ----- | -------------- | ------------------ | ---------------- | ---------------- | ---------------------- | ---------------------- | ------------- | ------------- | -------------- | -------------- |
| 1 | 1 | 1 | 6.70 | 7.19 | 8.22 | 8.79 | 22.7 | 22.2 | 1.52 | 1.60 |
| 10 | 10 | 10 | 8.09 | 8.64 | 9.14 | 9.62 | 13.0 | 11.3 | 1.05 | 0.98 |
| 10 | 10 | 30 | 8.09 | 8.64 | 10.83 | 11.48 | 33.9 | 32.9 | 2.74 | 2.84 |
| 10 | 10 | 50 | 8.09 | 8.64 | 11.76 | 12.55 | 45.4 | 45.3 | 3.67 | 3.91 |
| 10 | 10 | 100 | 8.09 | 8.64 | 15.81 | 16.73 | 95.5 | 93.6 | 7.72 | 8.09 |
| 20 | 20 | 100 | 8.13 | 8.57 | 18.66 | 19.62 | 129.6 | 129.0 | 10.54 | 11.05 |
| 50 | 50 | 100 | 8.23 | 8.74 | 28.55 | 29.63 | 247.0 | 239.0 | 20.32 | 20.89 |

### Neural Search Performance

For neural search, we used the Quora dataset, containing over 500,000 documents. The table below shows query latency with and without MMR reranking:

| **k** | **Query size** | **MMR candidates** | **Neural (p50 ms)** | **Neural (p90 ms)** | **Neural + MMR (p50 ms)** | **Neural + MMR (p90 ms)** | **p50 Δ (%)** | **p90 Δ (%)** | **p50 Δ (ms)** | **p90 Δ (ms)** |
| ----- | -------------- | ------------------ | ------------------- | ------------------- | ------------------------- | ------------------------- | ------------- | ------------- | -------------- | -------------- |
| 1 | 1 | 1 | 113.59 | 122.22 | 113.08 | 122.38 | -0.46 | 0.13 | -0.52 | 0.16 |
| 10 | 10 | 10 | 112.03 | 122.90 | 113.88 | 122.63 | 1.66 | -0.22 | 1.86 | -0.27 |
| 10 | 10 | 30 | 112.03 | 122.90 | 119.57 | 127.65 | 6.73 | 3.86 | 7.54 | 4.75 |
| 10 | 10 | 50 | 112.03 | 122.90 | 122.56 | 133.34 | 9.40 | 8.50 | 10.53 | 10.45 |
| 10 | 10 | 100 | 112.03 | 122.90 | 130.52 | 139.95 | 16.51 | 13.87 | 18.49 | 17.05 |
| 20 | 20 | 100 | 112.41 | 122.85 | 131.18 | 141.09 | 16.69 | 14.85 | 18.77 | 18.24 |
| 50 | 50 | 100 | 114.86 | 121.02 | 141.24 | 152.42 | 22.97 | 25.94 | 26.38 | 31.40 |

### Key Observations

1. MMR adds latency, and the increase grows with the number of MMR candidates and the query size.
2. KNN and neural queries without MMR scale well with k: the dominant cost comes from graph traversal (ef_search), not from selecting the top k candidates.

Choosing the number of MMR candidates requires balancing diversity against query latency. More candidates improve result diversity but increase latency, so select values appropriate for your workload.

## Using MMR with Cross-cluster Search

Currently, for [cross-cluster search](https://docs.opensearch.org/latest/search-plugins/cross-cluster-search/), OpenSearch cannot automatically resolve vector field information from the index mapping in the remote clusters. This means users must explicitly provide the vector field details when using MMR.

Here’s an example query:

```json
POST /my-index/_search
{
  "query": {
    "neural": {
      "my_vector_field": {
        "query_text": "query text",
        "model_id": "<your model id>"
      }
    }
  },
  "ext": {
    "mmr": {
      "diversity": 0.5,
      "candidates": 10,
      "vector_field_path": "my_vector_field",
      "vector_field_data_type": "float",
      "vector_field_space_type": "l2"
    }
  }
}
```

The MMR parameters for remote clusters are as follows:

- **vector_field_path:** Path to the vector field to use for MMR re-ranking.

- **vector_field_data_type:** Data type of the vector (e.g., float).

- **vector_field_space_type:** Distance metric used for similarity calculations (e.g., l2).

- **candidates** and **diversity:** Same as in local MMR queries, controlling the number of candidates and the diversity weight.

Providing this information ensures that MMR can correctly compute diversity and re-rank results even when querying across remote clusters.

## Summary

OpenSearch’s Maximal Marginal Relevance (MMR) feature makes it easy to deliver search results that are both relevant and diverse. By intelligently re-ranking results, MMR helps surface a wider variety of options, reduces redundancy, and creates a richer, more engaging search experience for your users.

If you’re looking to improve your vector search diversity, MMR in OpenSearch is a powerful tool to try today.

## What's Next

In the future, we can make MMR even easier and more flexible:

- **Better support for remote clusters:** Removing the need to manually specify vector field information.
- **Expanded query type support:** Currently, MMR supports only k-NN queries and neural queries on knn_vector fields. We could potentially support more query types, such as bool and hybrid queries, so that MMR can enhance a wider variety of search scenarios.
