fix: improve leaderboard scoring and fix aube package count #95

Merged
darcyclarke merged 1 commit into main from darcy/fix-scoring
Apr 20, 2026

@vltbaudbot
Contributor

Changes

Leaderboard scoring fixes

The leaderboard currently ranks package managers (PMs) by average time, which is unfair: a PM that records a DNF (did not finish) on most tests gets an artificially low average drawn only from its easy wins.

DNF penalty (both views):

  • Previously, DNFs were excluded from averages, so a PM that failed 58% of tests got a cherry-picked low average
  • Now, when a PM has a DNF for a fixture, that run is assigned the slowest successful time among all PMs for the same fixture (a 1:1 penalty, not 2x)
  • DNF PMs cannot win a fixture — only successful PMs compete for wins
  • If ALL PMs DNF for a fixture, it's skipped entirely
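
A minimal TypeScript sketch of the penalty rules above. The `FixtureTimes` shape and the `scoreFixtures` name are hypothetical and chosen for illustration; this is not the repo's actual implementation, and counting ties as wins for every tied PM is an assumption.

```typescript
// Hypothetical shape: seconds per PM for one fixture; null marks a DNF.
type FixtureTimes = Record<string, number | null>;

interface Tally {
  wins: number;
  total: number;
  counted: number;
}

function scoreFixtures(
  fixtures: FixtureTimes[],
): Map<string, { wins: number; average: number }> {
  const tallies = new Map<string, Tally>();

  for (const fixture of fixtures) {
    const successful: number[] = [];
    for (const time of Object.values(fixture)) {
      if (typeof time === "number") successful.push(time);
    }
    // If ALL PMs DNF for a fixture, it is skipped entirely.
    if (successful.length === 0) continue;

    const slowest = Math.max(...successful);
    const fastest = Math.min(...successful);

    for (const [pm, time] of Object.entries(fixture)) {
      const tally = tallies.get(pm) ?? { wins: 0, total: 0, counted: 0 };
      // A DNF is charged the slowest successful time (1:1, not 2x).
      tally.total += typeof time === "number" ? time : slowest;
      tally.counted += 1;
      // Only successful runs compete for the win (assumption: ties all win).
      if (typeof time === "number" && time === fastest) tally.wins += 1;
      tallies.set(pm, tally);
    }
  }

  const result = new Map<string, { wins: number; average: number }>();
  tallies.forEach((tally, pm) => {
    result.set(pm, { wins: tally.wins, average: tally.total / tally.counted });
  });
  return result;
}
```

For two fixtures `{ fast: 1, flaky: null, steady: 4 }` and `{ fast: 2, flaky: 1, steady: 3 }`, `flaky` averages 2.5 under this rule instead of the cherry-picked 1 it would get if its DNF were excluded.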

Average view (default leaderboard):

  • Sort by wins first (most wins = #1), then average time as tiebreaker
  • Ensures PMs that consistently win across many tests rank higher than PMs that only complete easy ones
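
The default ordering could be expressed as a comparator along these lines. The `Row` shape and `rankByWins` name are hypothetical, not taken from the repo.

```typescript
// Hypothetical row shape for the default (average) leaderboard view.
interface Row {
  pm: string;
  wins: number;
  average: number;
}

// Most wins first; a lower average time breaks ties between equal win counts.
function rankByWins(rows: Row[]): Row[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...rows].sort((a, b) => b.wins - a.wins || a.average - b.average);
}
```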

Specific variant views:

  • Keep sorting by average time (lower is better), with wins as tiebreaker
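
For a specific variant view the two sort keys swap roles. Again a sketch with hypothetical names (`VariantRow`, `rankByAverage`):

```typescript
// Hypothetical row shape for a single-variant leaderboard view.
interface VariantRow {
  pm: string;
  wins: number;
  average: number;
}

// Lowest average time first; more wins breaks ties between equal averages.
function rankByAverage(rows: VariantRow[]): VariantRow[] {
  // Copy before sorting so the caller's array is left untouched.
  return [...rows].sort((a, b) => a.average - b.average || b.wins - a.wins);
}
```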

Aube package count fix

Aube wasn't showing package counts in the benchmarks because:

  • aube uses symlinks inside node_modules/.aube/ (unlike pnpm which uses hard links inside .pnpm/)
  • The standard find -type f does not follow symlinks, so it found 0 packages
  • Fix: detect node_modules/.aube/ and use find -L scoped to that directory to follow symlinks, then deduplicate unique package names
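
The actual fix uses find -L and sed in the benchmark script. As an illustration of the deduplication step only, here is a TypeScript sketch that takes a list of paths (such as find -L might print) and counts unique package names. The function names are hypothetical, and the sketch assumes the package name is the segment after the last `node_modules/`, or two segments for scoped packages; the exact layout inside `.aube/` is not described in this PR.

```typescript
// Extract a package name from a path, assuming it is the segment (or the
// two segments, for @scope/name) right after the last "node_modules/".
function packageNameFromPath(path: string): string | null {
  const marker = "node_modules/";
  const idx = path.lastIndexOf(marker);
  if (idx === -1) return null;

  const parts = path.slice(idx + marker.length).split("/");
  const first = parts[0];
  // Skip empty segments and dot-directories such as ".aube" or ".bin".
  if (!first || first.startsWith(".")) return null;

  if (first.startsWith("@")) {
    const second = parts[1];
    return second ? `${first}/${second}` : null;
  }
  return first;
}

// Count unique package names across all paths.
function countUniquePackages(paths: string[]): number {
  const names = new Set<string>();
  for (const path of paths) {
    const name = packageNameFromPath(path);
    if (name !== null) names.add(name);
  }
  return names.size;
}
```

Several files from the same package collapse to a single entry, so the count reflects packages rather than files.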

Co-authored-by: Darcy Clarke <darcy@darcyclarke.com>

Leaderboard scoring:
- DNF penalty: assign slowest successful time (1:1) instead of
  excluding DNFs from averages, so PMs that fail tests can't
  cherry-pick artificially low averages
- Average view: sort by wins first (most wins = #1), then average
  time as tiebreaker — ensures PMs that consistently win across
  many tests rank higher
- Specific variant views: keep sorting by average time with wins
  as tiebreaker
- DNF PMs cannot win a fixture (only successful PMs compete for wins)
- Skip fixture entirely if ALL PMs DNF

Aube package count:
- aube uses symlinks inside node_modules/.aube/ (unlike pnpm which
  uses hard links inside .pnpm/), so find -type f cannot traverse them
- Detect node_modules/.aube/ and use find -L scoped to that directory
  to follow symlinks, then deduplicate unique package names via sed

Co-authored-by: Darcy Clarke <darcy@darcyclarke.com>
@darcyclarke darcyclarke merged commit e1a6f3e into main Apr 20, 2026
48 checks passed
@darcyclarke darcyclarke deleted the darcy/fix-scoring branch April 20, 2026 20:07

3 participants