Feature request
Problem
When using validate text-file with a regex that has multiple text capture groups, only one text group can be validated per invocation. For structured reference assertions like:
- @ref PMID:32550677 "Multiple Acyl-CoA Dehydrogenase Deficiency" excerpt: "disorder of fatty acid and amino acid oxidation"
The regex @ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)" produces three groups:
- PMID number (reference)
- Title (text to validate)
- Excerpt (text to validate)
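To make the group numbering concrete, here is a small standalone Python sketch that applies the regex above to the sample line with the standard `re` module. It is only an illustration of which group holds what; the variable names are mine and nothing here is taken from the validator's code.

```python
import re

# Pattern and sample line copied from the issue text above.
PATTERN = re.compile(r'@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"')
LINE = (
    '- @ref PMID:32550677 "Multiple Acyl-CoA Dehydrogenase Deficiency" '
    'excerpt: "disorder of fatty acid and amino acid oxidation"'
)

match = PATTERN.search(LINE)
if match is None:
    raise SystemExit("pattern did not match the sample line")

# Groups are numbered from 1, in order of their opening parenthesis.
pmid, title, excerpt = match.groups()
print("group 1 (reference):", pmid)
print("group 2 (title):", title)
print("group 3 (excerpt):", excerpt)
```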
To validate both title and excerpt, you must run the command twice:
# Validate titles
uvx linkml-reference-validator validate text-file doc.md \
--regex '@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"' \
--ref-group 1 --text-group 2
# Validate excerpts
uvx linkml-reference-validator validate text-file doc.md \
--regex '@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"' \
--ref-group 1 --text-group 3
Proposed solution
Allow --text-group to accept multiple values:
uvx linkml-reference-validator validate text-file doc.md \
--regex '@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"' \
--ref-group 1 --text-group 2 --text-group 3
This would validate both capture groups against the reference in a single pass, reporting results for each group separately.
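A rough sketch of how a repeatable flag could behave, written with Python's standard `argparse` and `re` modules. This is an assumption-laden illustration, not the tool's actual implementation: `build_parser` and `extract_claims` are hypothetical names, and the real CLI may be built on a different framework.

```python
import argparse
import re


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser; only mirrors the flags named in this issue."""
    parser = argparse.ArgumentParser(prog="validate-text-file-sketch")
    parser.add_argument("--regex", required=True)
    parser.add_argument("--ref-group", type=int, default=1)
    # action="append" collects every occurrence of the flag into a list,
    # so "--text-group 2 --text-group 3" yields [2, 3].
    parser.add_argument("--text-group", type=int, action="append", required=True)
    return parser


def extract_claims(text, pattern, ref_group, text_groups):
    """Yield (line_number, reference, group_number, text) per text group."""
    compiled = re.compile(pattern)
    for line_number, line in enumerate(text.splitlines(), start=1):
        for m in compiled.finditer(line):
            for group_number in text_groups:
                yield line_number, m.group(ref_group), group_number, m.group(group_number)


if __name__ == "__main__":
    args = build_parser().parse_args()
    sample = '- @ref PMID:32550677 "Some title" excerpt: "some excerpt"'
    for claim in extract_claims(sample, args.regex, args.ref_group, args.text_group):
        print(claim)
```

Each yielded tuple carries its group number, so a report could list title and excerpt results separately while the file is only scanned once.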
Nice-to-have
Emit a warning when the regex has more capture groups than are referenced by --text-group and --ref-group. This would help catch off-by-one errors in group numbering (which I hit during initial use).
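One possible way to detect unreferenced groups, sketched in Python: a compiled pattern exposes its capture-group count as `Pattern.groups`, which can be compared with the group numbers passed on the command line. The helper names `unused_groups` and `warn_on_unused_groups` are hypothetical and not part of the tool.

```python
import re
import warnings


def unused_groups(pattern: str, ref_group: int, text_groups: list[int]) -> list[int]:
    """Return capture-group numbers in `pattern` that no option references."""
    total = re.compile(pattern).groups  # number of capturing groups
    used = {ref_group, *text_groups}
    return [g for g in range(1, total + 1) if g not in used]


def warn_on_unused_groups(pattern: str, ref_group: int, text_groups: list[int]) -> list[int]:
    """Emit a warning listing unreferenced groups; return them for inspection."""
    unused = unused_groups(pattern, ref_group, text_groups)
    if unused:
        warnings.warn(
            f"regex defines capture group(s) {unused} that are not referenced "
            "by --ref-group or --text-group"
        )
    return unused


if __name__ == "__main__":
    pattern = r'@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"'
    # Only group 2 is validated here, so group 3 is reported as unused.
    print(warn_on_unused_groups(pattern, ref_group=1, text_groups=[2]))
```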
Use case
Validating structured reference annotations in markdown analysis documents where each citation includes both a title and a supporting text excerpt that should both be verified against the cited publication.