Support multiple --text-group values in a single validation pass #40

@cmungall

Feature request

Problem

When using `validate text-file` with a regex that has multiple text capture groups, only one of those groups can be validated per invocation. Consider a structured reference assertion like:

- @ref PMID:32550677 "Multiple Acyl-CoA Dehydrogenase Deficiency" excerpt: "disorder of fatty acid and amino acid oxidation"

The regex `@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"` produces three groups:

  1. PMID number (reference)
  2. Title (text to validate)
  3. Excerpt (text to validate)
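
To make the group numbering concrete, here is a small standalone check using Python's standard `re` module. It only illustrates how the pattern above splits the sample line; it does not call the validator, and the variable names are mine.

```python
import re

# Pattern and sample line copied from the issue text above.
PATTERN = re.compile(r'@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"')
LINE = (
    '- @ref PMID:32550677 "Multiple Acyl-CoA Dehydrogenase Deficiency" '
    'excerpt: "disorder of fatty acid and amino acid oxidation"'
)

match = PATTERN.search(LINE)
assert match is not None

# Capture groups are numbered from 1, left to right.
for number, value in enumerate(match.groups(), start=1):
    print(f"group {number}: {value}")
# group 1: 32550677
# group 2: Multiple Acyl-CoA Dehydrogenase Deficiency
# group 3: disorder of fatty acid and amino acid oxidation
```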

To validate both title and excerpt, you must run the command twice:

# Validate titles
uvx linkml-reference-validator validate text-file doc.md \
  --regex '@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"' \
  --ref-group 1 --text-group 2

# Validate excerpts
uvx linkml-reference-validator validate text-file doc.md \
  --regex '@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"' \
  --ref-group 1 --text-group 3

Proposed solution

Allow `--text-group` to be repeated so that it accepts multiple values:

uvx linkml-reference-validator validate text-file doc.md \
  --regex '@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"' \
  --ref-group 1 --text-group 2 --text-group 3

This would validate both capture groups against the reference in a single pass, reporting results for each group separately.
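
A rough sketch of what I have in mind, in plain Python. This is not the tool's actual code: `Candidate` and `iter_candidates` are hypothetical names, and the real validation step (checking each text against the cited publication) is left out. The point is only that one regex pass can yield one (reference, text) pair per requested text group.

```python
import re
from typing import Iterator, NamedTuple, Sequence


class Candidate(NamedTuple):
    """One (reference, text) pair to validate. Hypothetical structure."""
    line_number: int
    reference: str
    text_group: int
    text: str


def iter_candidates(
    document: str,
    regex: str,
    ref_group: int,
    text_groups: Sequence[int],
) -> Iterator[Candidate]:
    """Yield one candidate per match and per requested text group."""
    pattern = re.compile(regex)
    for line_number, line in enumerate(document.splitlines(), start=1):
        for match in pattern.finditer(line):
            reference = match.group(ref_group)
            for group in text_groups:
                yield Candidate(line_number, reference, group, match.group(group))


REGEX = r'@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"'
DOC = (
    "# Notes\n"
    '- @ref PMID:32550677 "Multiple Acyl-CoA Dehydrogenase Deficiency" '
    'excerpt: "disorder of fatty acid and amino acid oxidation"\n'
)

candidates = list(iter_candidates(DOC, REGEX, ref_group=1, text_groups=[2, 3]))
for candidate in candidates:
    print(candidate)
```

Keeping the group number on each candidate is what would let the report show results for each group separately.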

Nice-to-have

A warning when the regex has more capture groups than are referenced by `--text-group` and `--ref-group`. This would help catch off-by-one errors in group numbering (which I hit during initial use).
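
One possible way to detect this, sketched with the standard library only. The function names are hypothetical and this is not taken from the tool's code; it relies on the fact that a compiled pattern exposes its capture group count as `.groups`.

```python
import re
import warnings
from typing import List, Sequence


def unused_groups(regex: str, ref_group: int, text_groups: Sequence[int]) -> List[int]:
    """Return capture group numbers that neither option refers to."""
    total = re.compile(regex).groups  # number of capturing groups in the pattern
    used = {ref_group, *text_groups}
    return [number for number in range(1, total + 1) if number not in used]


def warn_on_unused_groups(regex: str, ref_group: int, text_groups: Sequence[int]) -> List[int]:
    """Emit a warning if some capture groups are never referenced."""
    unused = unused_groups(regex, ref_group, text_groups)
    if unused:
        warnings.warn(
            f"regex has capture group(s) {unused} not referenced by "
            "--ref-group or --text-group"
        )
    return unused


REGEX = r'@ref PMID:(\d+) "([^"]*)" excerpt: "([^"]*)"'

print(unused_groups(REGEX, ref_group=1, text_groups=[2]))     # [3]
print(unused_groups(REGEX, ref_group=1, text_groups=[2, 3]))  # []
```

Note that `.groups` counts every capturing group, so a regex that uses parentheses purely for grouping would trigger the warning unless those are written as non-capturing `(?:...)`.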

Use case

Validating structured reference annotations in markdown analysis documents, where each citation includes a title and a supporting text excerpt, both of which should be verified against the cited publication.

Labels: enhancement
