@patrick-austin
Contributor

When indexing, we were using a non-synonym version of the analyzer, but when searching, synonyms were being injected. For basic OR logic queries (e.g. path to mr file) this is OK. The search term gets molecular replac injected, which isn't present in the Document, but because it's an OR query the remaining terms still match.

When doing a phrase query (the most common use case is quoting an exact filepath, which conveniently removes the / characters), each word has to match in order. At this point, there is no longer a match for the injected terms.

By using IcatSynonymAnalyzer for both, we can ensure that the injected terms appear in both the search query and the indexed Document we match against.
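
To illustrate the mismatch, here is a minimal sketch in plain Java. It is a toy model, not Lucene and not the actual IcatSynonymAnalyzer: the class name, the helper methods, and the one-entry synonym table (mr mapped to molecular replac) are all hypothetical, and real analyzers track token positions in ways this ignores. It only shows why an OR query tolerates query-side injection while a strict in-order phrase match does not, unless the same injection happens at index time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Toy model only: this is NOT Lucene and does not reproduce its position handling.
final class SynonymPhraseSketch {

    // Hypothetical synonym table, loosely based on the example in the description.
    static final Map<String, List<String>> SYNONYMS =
            Map.of("mr", List.of("molecular", "replac"));

    // Analyzer without synonyms: lowercase and split on non-alphanumerics, so "/" disappears.
    static List<String> plainTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Analyzer with synonyms: each token is followed by its injected terms, if any.
    static List<String> synonymTokens(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : plainTokens(text)) {
            tokens.add(t);
            tokens.addAll(SYNONYMS.getOrDefault(t, List.of()));
        }
        return tokens;
    }

    // OR query: one shared token between query and document is enough.
    static boolean orMatch(List<String> doc, List<String> query) {
        return !Collections.disjoint(doc, query);
    }

    // Phrase query: the query tokens must appear in the document contiguously and in order.
    static boolean phraseMatch(List<String> doc, List<String> query) {
        return !query.isEmpty() && Collections.indexOfSubList(doc, query) >= 0;
    }

    public static void main(String[] args) {
        String path = "path/to/mr/file";
        List<String> query = synonymTokens(path);       // synonyms injected at search time
        List<String> plainDoc = plainTokens(path);      // indexed without synonyms
        List<String> synonymDoc = synonymTokens(path);  // indexed with synonyms

        System.out.println("OR vs plain index:       " + orMatch(plainDoc, query));
        System.out.println("phrase vs plain index:   " + phraseMatch(plainDoc, query));
        System.out.println("phrase vs synonym index: " + phraseMatch(synonymDoc, query));
    }
}
```

In this model the phrase match fails against the plain index only because the query carries tokens the document never received, which is the situation that using the same analyzer on both sides avoids.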

@patrick-austin patrick-austin requested a review from ajkyffin April 14, 2025 11:03
Contributor

@ajkyffin ajkyffin left a comment

This looks like a horrible thing to do (processing a file path for synonyms) but if it's the only way to get it to work then fine.

@patrick-austin
Contributor Author

I think you could use the add method instead of parse(reader), which would give you some freedom on where to source the synonym mappings. Ideally I wanted the synonyms to be configurable (e.g. if PaNET expands its ontology, I don't want that to be a source code change) but in reality I don't think DLS will really care enough to tailor the synonyms. If there's a more elegant way of achieving this you can think of, happy to do that in a future change.

@patrick-austin patrick-austin merged commit 4de231c into master Apr 16, 2025
2 checks passed
@ajkyffin ajkyffin deleted the use_synonymAnalyzer_to_write branch May 1, 2025 10:10