
Conversation

@augustlakia

Summary

Adds support for configurable chrF metric parameters (e.g., char_order, word_order) in task YAML files, enabling
users to specify chrF++ and other chrF variants directly in their task configurations.

Problem

The chrF metric implementation in lm_eval/api/metrics.py calls sacrebleu.corpus_chrf() with hardcoded default
parameters (char_order=6, word_order=0). This prevents users from configuring chrF++ (which uses word_order=2) or
other chrF variants through task YAML files.

def corpus_chrf(hypotheses: Sequence[str],
                references: Sequence[Sequence[str]],
                char_order: int = CHRF.CHAR_ORDER,
                word_order: int = CHRF.WORD_ORDER,
                beta: int = CHRF.BETA,
                remove_whitespace: bool = True,
                eps_smoothing: bool = False) -> CHRFScore:

Solution

  • Added metric_kwargs parameter to stderr_for_metric() function in lm_eval/api/metrics.py
  • Updated calculate_aggregate_metric() in lm_eval/evaluator_utils.py to pass metric_kwargs to stderr calculation
  • All bootstrap iterations now correctly use configured parameters from task YAML files
  • Users can now configure any of the available chrF parameters in their task YAML files:
    char_order
    word_order
    beta
    remove_whitespace
    eps_smoothing
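The key point of the stderr change is that each bootstrap resample must call the metric with the same kwargs the task YAML configured. A simplified sketch (the function names mirror the PR description, but the bodies here are illustrative, not the actual lm_eval code):

```python
import random
from statistics import stdev
from typing import Callable, Sequence


def bootstrap_stderr(metric_fn: Callable[..., float],
                     items: Sequence[float],
                     iters: int = 100,
                     **metric_kwargs) -> float:
    """Bootstrap stderr that forwards metric kwargs to every resample."""
    scores = []
    for _ in range(iters):
        # Resample with replacement, then score with the configured kwargs,
        # so every iteration evaluates the same metric variant as the
        # main aggregation (e.g. word_order=2 for chrF++).
        sample = random.choices(items, k=len(items))
        scores.append(metric_fn(sample, **metric_kwargs))
    return stdev(scores)


# Toy metric standing in for chrF: its value depends on a configurable
# parameter, much like char_order/word_order do for chrF.
def power_mean(xs, p=1):
    return (sum(x ** p for x in xs) / len(xs)) ** (1 / p)


random.seed(0)
err = bootstrap_stderr(power_mean, [1.0, 2.0, 3.0, 4.0], iters=200, p=2)
print(round(err, 4))
```

Without forwarding metric_kwargs, the point estimate would use the configured variant while the stderr silently measured the default one, which is the bug this PR fixes.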

Testing

Tested with a chrF++ configuration (word_order=2) in a task YAML file:

  metric_list:
    - metric: chrf
      char_order: 6
      word_order: 2

Verified that both main aggregation and all bootstrap stderr iterations correctly use word_order=2 instead of
reverting to the default word_order=0.
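For completeness, all five parameters can be set the same way. The values below are illustrative (they happen to match sacrebleu's defaults except for word_order), not recommendations:

  metric_list:
    - metric: chrf
      char_order: 6
      word_order: 2
      beta: 2
      remove_whitespace: true
      eps_smoothing: false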

#2256

@CLAassistant

CLAassistant commented Oct 23, 2025

CLA assistant check
All committers have signed the CLA.

@augustlakia augustlakia marked this pull request as ready for review October 23, 2025 07:03
@augustlakia augustlakia requested a review from baberabb as a code owner October 23, 2025 07:03
@baberabb
Contributor

Hi! Thanks for the PR. LGTM, but I want to test it out before merging, so feel free to ping me if not in the next couple of days!

@augustlakia
Author

Hey @baberabb, just checking in, did you have a chance to test it out yet?

@augustlakia
Author

> Hi! Thanks for the PR. LGTM, but I want to test it out before merging, so feel free to ping me if not in the next couple of days!

Hi! Just pinging you @baberabb
