feat(examples): add custom benchmark examples for BEAM#821
Draft
egordm wants to merge 7 commits into release/v4.0.0 from …le-custom-baseline-benchmark
Conversation
…mments

- example_baseline.py: minimal forecaster (constant median) implementing BacktestForecasterMixin
- example_benchmark.py: custom target provider extending SimpleTargetProvider
- run_liander2024_benchmark.py: run example baseline + GBLinear on Liander 2024
- run_benchmark.py: run example baseline + GBLinear using custom benchmark pipeline
- README.md: beginner-friendly guide with setup and usage instructions
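A constant-median baseline like the one described above can be sketched as follows. This is a hypothetical illustration: the real example_baseline.py implements BEAM's BacktestForecasterMixin, whose exact interface is not shown here, so the class name, method signatures, and quantile handling below are assumptions.

```python
import numpy as np
import pandas as pd


class ConstantMedianForecaster:
    """Hypothetical sketch: predicts the training target's median
    for every requested timestamp and quantile. The real example
    implements BEAM's BacktestForecasterMixin interface instead."""

    def __init__(self, quantiles=(0.1, 0.5, 0.9)):
        self.quantiles = quantiles
        self.median_ = None

    def fit(self, target: pd.Series) -> "ConstantMedianForecaster":
        # A single statistic is all this baseline learns.
        self.median_ = float(np.median(target.to_numpy()))
        return self

    def predict(self, index: pd.Index) -> pd.DataFrame:
        # Every quantile column gets the same constant value,
        # which makes this a useful sanity-check baseline.
        return pd.DataFrame(
            {q: self.median_ for q in self.quantiles}, index=index
        )
```

Such a baseline is deliberately trivial: if a real model cannot beat a constant median on the benchmark metrics, something is wrong with the model or the pipeline wiring.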
Change DnTHH:MM format to DnTHHMM (e.g. D-1T0600 instead of D-1T06:00). Colons are illegal in Windows file paths, breaking benchmark output directories. from_string() now accepts both formats for backward compatibility.
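The dual-format parsing described in this commit could look roughly like the sketch below. The class name AvailableAt comes from the PR, but its fields, the sign handling, and the regex are assumptions made for illustration; the actual BEAM implementation may differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AvailableAt:
    """Hypothetical sketch of the D-1T0600 lead-time format."""

    day_offset: int
    hour: int
    minute: int

    def __str__(self) -> str:
        # New Windows-safe format: HHMM without a colon, since
        # colons are illegal in Windows file and directory names.
        sign = "+" if self.day_offset >= 0 else "-"
        return f"D{sign}{abs(self.day_offset)}T{self.hour:02d}{self.minute:02d}"

    @classmethod
    def from_string(cls, s: str) -> "AvailableAt":
        # Accept both D-1T06:00 (legacy) and D-1T0600 (new):
        # the optional ":?" keeps old configs parseable.
        m = re.fullmatch(r"D([+-]\d+)T(\d{2}):?(\d{2})", s)
        if m is None:
            raise ValueError(f"Invalid AvailableAt string: {s!r}")
        return cls(int(m.group(1)), int(m.group(2)), int(m.group(3)))
```

Making `from_string()` tolerant of both spellings means existing configs and stored run directories keep working while new output paths are Windows-safe.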
Allows users to inject their own forecast predictions and run only evaluation + analysis, skipping the backtesting step entirely. Includes format_predictions() helper and a stub forecaster factory.
…erConfig

- Correct import paths for BacktestForecasterMixin and RestrictedHorizonVersionedTimeSeries
- Add missing required fields to BacktestForecasterConfig in stub
- Verified: pipeline correctly skips backtesting and runs eval + analysis
… data format

- Rename evaluate_forecasts.py → evaluate_existing_forecasts.py
- Remove format_predictions() helper and save loop (step 1)
- Just point LocalBenchmarkStorage at existing parquets
- Use DummyForecaster instead of custom _QuantileStub
- Add README section: directory layout, parquet format table, example rows
- Add LeadTime to evaluation config in example_benchmark.py
Add compare_liander2024_results.py and compare_custom_results.py that generate side-by-side comparison plots across benchmark runs using BenchmarkComparisonPipeline. Update README with comparison section.



Summary
Adds end-to-end examples showing how to create and run custom BEAM benchmarks, plus a Windows compatibility fix.
Custom Benchmark Examples
- example_baseline.py: minimal forecaster implementing BacktestForecasterMixin, predicts a constant median
- example_benchmark.py: custom target provider extending SimpleTargetProvider, with metrics and pipeline assembly
- run_liander2024_benchmark.py: run the example baseline + GBLinear on Liander 2024
- run_benchmark.py: run the example baseline + GBLinear using the custom benchmark pipeline
- README.md: beginner-friendly guide with setup and usage instructions

All files include inline comments explaining concepts for newcomers (quantiles, RestrictedHorizonVersionedTimeSeries, config parameters, metrics, etc.).

Windows Compatibility Fix

AvailableAt.__str__() now produces D-1T0600 instead of D-1T06:00. Colons are illegal in Windows file/directory names, which broke benchmark output paths like benchmark_results/analysis/mv_feeder/OS Edam/D-1T06:00/. from_string() accepts both D-1T06:00 (legacy) and D-1T0600 (new) for backward compatibility.