fix: block dangerous Elasticsearch query types in MCP searchWithDirectQuery #27730
JasonOA888 wants to merge 2 commits into open-metadata:main from
…tQuery

The MCP searchMetadata tool accepts a user-controlled queryFilter that is passed directly to Elasticsearch via Query.of(q -> q.withJson(...)). Without validation, an authenticated MCP user could craft queries using script, script_score, wrapper, percolator, or scripted_metric types, enabling arbitrary code execution (Painless/Groovy), query obfuscation (wrapper base64), or denial-of-service.

Add validateQuerySafety() which recursively walks the parsed JSON query tree and rejects any node containing a blocked query type key. This check runs before the query reaches Elasticsearch, ensuring dangerous types are blocked regardless of nesting depth (e.g., inside bool clauses).

Signed-off-by: Jason L <jason@outland.art>
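For readers who want a concrete picture of the recursive walk described in this commit message, here is a minimal sketch. It is not the code from this PR: the class name `QuerySafetySketch`, the `isSafe` helper, the choice of `IllegalArgumentException`, and the use of plain `Map`/`List` values in place of a parsed JSON tree are all assumptions made to keep the example self-contained. The blocked keys are the ones named in the two commit messages on this page.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the PR's implementation. A parsed JSON query is
// modelled here as nested java.util.Map / java.util.List values (assumption).
class QuerySafetySketch {

  // Keys named as blocked in the commit messages of this PR.
  static final Set<String> BLOCKED_QUERY_TYPES = Set.of(
      "script", "script_score", "wrapper", "percolator", "scripted_metric",
      "function_score", "runtime_mappings");

  // Recursively walks the tree and throws on the first blocked key it finds.
  static void validateQuerySafety(Object node) {
    if (node instanceof Map) {
      for (Map.Entry<?, ?> entry : ((Map<?, ?>) node).entrySet()) {
        String key = String.valueOf(entry.getKey());
        if (BLOCKED_QUERY_TYPES.contains(key)) {
          throw new IllegalArgumentException(
              "Query type '" + key + "' is not allowed");
        }
        validateQuerySafety(entry.getValue());
      }
    } else if (node instanceof List) {
      for (Object item : (List<?>) node) {
        validateQuerySafety(item);
      }
    }
    // Scalars (strings, numbers, booleans, null) carry no keys to inspect.
  }

  // Hypothetical convenience wrapper so callers can test without try/catch.
  static boolean isSafe(Object query) {
    try {
      validateQuerySafety(query);
      return true;
    } catch (IllegalArgumentException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    Object plain = Map.of("term", Map.of("owner", "alice"));

    // A script query hidden inside bool.must, i.e. not at the top level.
    Object nested = Map.of("bool", Map.of("must", List.of(
        Map.of("term", Map.of("owner", "alice")),
        Map.of("script", Map.of("script", Map.of("source", "true"))))));

    System.out.println(isSafe(plain));  // prints true
    System.out.println(isSafe(nested)); // prints false
  }
}
```

One consequence of a purely key-based check like this sketch is that it would also reject a query that references a document field which happens to be named `script` or `wrapper`. Whether the PR's version behaves the same way is not stated here.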
Hi there 👋 Thanks for your contribution! The OpenMetadata team will review the PR shortly! Once it has been labeled as Let us know if you need any help!
Address review feedback:
- Add function_score (allows script injection via scoring functions)
- Add runtime_mappings (allows Painless scripts at query time)
- Fix duplicate import java.util.Map
- Add tests for both new blocked types

Reported-by: gitar-bot[bot]
Code Review ✅ Approved (2 resolved / 2 findings)
Restricts
✅ 2 resolved
✅ Security: Blocklist missing
Summary
The MCP `searchMetadata` tool accepts a user-controlled `queryFilter` that is parsed as raw JSON and passed directly to Elasticsearch via `Query.of(q -> q.withJson(...))`. While RBAC conditions are applied as filters afterward, the user query itself is not validated for dangerous query types.

This means an authenticated MCP user could craft queries using:
- `script`: execute arbitrary Painless/Groovy scripts server-side
- `script_score`: script-based scoring with arbitrary code
- `wrapper`: base64-encoded queries that can hide nested dangerous types
- `percolator`: reverse query matching
- `scripted_metric`: script-based aggregation with init/map/combine/reduce scripts

Fix
Add `validateQuerySafety()` which recursively walks the parsed JSON query tree and rejects any node containing a blocked query type key: