Weekly retrain pipeline + hourly accuracy tracking #3

Open
rupeshbharambe24 wants to merge 1 commit into main from feat/weekly-pipeline-and-hourly-accuracy

@rupeshbharambe24

Owner

Summary

  • One-click weekly training (train_weekly.bat) that scrapes, retrains all 3 models, predicts the next 7 days, commits and pushes
  • New GH Actions workflow daily_predict.yml running daily 00:30 UTC for predict + log only (no training — fits the free tier)
  • New prediction_log_hourly table + /api/v1/dashboard/hourly-accuracy endpoint for the 6-month org evaluation
  • Admin dashboard button to trigger the GH workflow on demand

Training results (this run, full data through 2026-06-09)

Resolution  Model     CV MAPE  Holdout MAPE  (unlabeled)  Size
5-min       LightGBM  0.18%    0.15%         0.9997       8.1 MB
Hourly      XGBoost   0.52%    -             -            22 MB
Daily       LightGBM  2.70%    -             -            3 MB
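
For readers comparing the MAPE figures above, the metric is conventionally computed as sketched below. This is a minimal illustration only: the function name and the sample values in the usage note are hypothetical, and the repository's own metric code may differ, for example in how zero actuals are handled.

```python
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Assumption: points with a zero actual are skipped to avoid dividing by zero.
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    if not pairs:
        raise ValueError("no non-zero actual values to evaluate")
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)
```

With actuals of 100 and 200 and predictions of 110 and 190, the per-point errors are 10% and 5%, so `mape([100, 200], [110, 190])` averages them.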

Data refresh

  • Backfilled 63 days of missing demand (Apr 8 → Jun 9). Demand table: 202,734 → 220,559 rows.
  • Predictions logged for next 7 days at hourly granularity (192 rows).

Operating cadence

  • Weekly (Friday 23:00 IST): double-click train_weekly.bat — full retrain, ~15-20 min, auto-commits and pushes new models
  • Daily (06:00 IST): GH Actions runs --no-train mode — scrape + predict + log
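
The two cadences differ only in whether the training step runs. A minimal sketch of how a single entrypoint could switch between them with the `--no-train` flag follows. The step names come from the test plan (scrape, train, predict, log); the function names and the placeholder step bodies are assumptions, not the actual contents of scripts/run_full_pipeline.py.

```python
import argparse
from typing import List, Optional, Sequence


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Scrape, optionally retrain, predict, and log."
    )
    # --no-train is the flag named in the PR; everything else here is illustrative.
    parser.add_argument(
        "--no-train",
        action="store_true",
        help="skip retraining (daily predict-only mode)",
    )
    return parser


def plan_steps(no_train: bool) -> List[str]:
    """Return the ordered pipeline steps for the chosen mode."""
    steps = ["scrape"]
    if not no_train:
        steps.append("train")
    steps += ["predict", "log"]
    return steps


def main(argv: Optional[Sequence[str]] = None) -> List[str]:
    args = build_parser().parse_args(argv)
    steps = plan_steps(args.no_train)
    for step in steps:
        # Placeholder: the real pipeline would call the scraper, trainer, etc.
        print(f"running step: {step}")
    return steps
```

Calling `main([])` would correspond to the weekly full retrain, and `main(["--no-train"])` to the daily GH Actions run.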

Setup required after merge

  1. On Render, add env var GITHUB_PAT (a personal access token with workflow scope) to enable the admin Run-Pipeline button
  2. Optional: UptimeRobot ping on /api/v1/health/ready every 5 min to keep Render warm
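
For context on step 1, the sketch below shows roughly how a backend could use GITHUB_PAT to dispatch a workflow through the GitHub REST API's "create a workflow dispatch event" endpoint. It uses only the standard library. The owner and repo arguments are placeholders because the repository path is not stated here, and the function names are hypothetical; the actual /api/v1/admin/trigger-pipeline handler may be written differently.

```python
import json
import os
import urllib.request


def build_dispatch_request(
    owner: str, repo: str, workflow_file: str, token: str, ref: str = "main"
) -> urllib.request.Request:
    """Build the POST request that asks GitHub to run a workflow on `ref`."""
    url = (
        f"https://api.github.com/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    body = json.dumps({"ref": ref}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "X-GitHub-Api-Version": "2022-11-28",
        },
    )


def trigger_pipeline(
    owner: str, repo: str, workflow_file: str = "daily_predict.yml"
) -> int:
    """Send the dispatch request; GitHub answers 204 No Content on success."""
    token = os.environ.get("GITHUB_PAT")
    if not token:
        raise RuntimeError("GITHUB_PAT is not set")
    request = build_dispatch_request(owner, repo, workflow_file, token)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

The target workflow file must declare a `workflow_dispatch` trigger for this call to succeed. Failing fast when the token is missing keeps the admin button's error message clear.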

Test plan

  • 50/50 pytest suite passes
  • Pipeline runs locally end-to-end (scrape → train → predict → log)
  • Verify daily_predict.yml runs successfully in GH Actions (after merge)
  • Verify admin trigger button works after GITHUB_PAT is set on Render

- scripts/run_full_pipeline.py: single entrypoint that scrapes, retrains
  all 3 models, predicts next 7 days, and logs each hour. Used by both
  train_weekly.bat (local Friday) and daily_predict.yml (GH Actions).

- train_weekly.bat: Windows one-click wrapper that pulls main, runs the
  pipeline, then commits and pushes the new models. Designed to be
  triggered weekly via a Friday 23:00 IST calendar reminder.

- .github/workflows/daily_predict.yml: predict-only workflow (no
  training) that runs daily at 00:30 UTC. Replaces scheduler.yml, which
  is now deprecated and runs only via workflow_dispatch.

- prediction_log_hourly table: 24 rows per (date, model) capturing
  predicted vs actual at hourly granularity. Powers a 6-month
  org-readable evaluation dashboard.

- /api/v1/dashboard/hourly-accuracy: returns per-hour entries plus an
  hour-of-day rollup of MAPE/MAE/interval coverage.

- /api/v1/admin/trigger-pipeline: kicks the GH Actions workflow via
  workflow_dispatch (needs GITHUB_PAT env var on Render). Wired to a
  'Run Daily Pipeline' button in the admin page header.

- Refreshed daily and hourly champions on full data through 2026-06-09.
  Reported CV MAPE: 5-min 0.18%, hourly (XGBoost) 0.52%, daily 2.70%.
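
To make the hourly log and its hour-of-day rollup concrete, here is a minimal sketch using SQLite. Only the table name, the one-row-per-hour layout per (date, model), and the three rollup metrics come from the notes above. The column names, the interval bounds, and the choice of SQLite are assumptions for illustration, not the project's actual schema or database.

```python
import sqlite3
from typing import Dict, List

# Assumed schema: one row per (date, hour, model), so 24 rows per (date, model).
SCHEMA = """
CREATE TABLE IF NOT EXISTS prediction_log_hourly (
    date      TEXT    NOT NULL,
    hour      INTEGER NOT NULL CHECK (hour BETWEEN 0 AND 23),
    model     TEXT    NOT NULL,
    predicted REAL    NOT NULL,
    lower     REAL    NOT NULL,
    upper     REAL    NOT NULL,
    actual    REAL,
    PRIMARY KEY (date, hour, model)
)
"""

# Rollup by hour of day: MAE, MAPE (percent), and interval coverage (percent).
# Rows without an actual yet, or with a zero actual, are left out.
ROLLUP = """
SELECT hour,
       COUNT(*) AS n,
       AVG(ABS(actual - predicted)) AS mae,
       100.0 * AVG(ABS(actual - predicted) / ABS(actual)) AS mape,
       100.0 * AVG(CASE WHEN actual BETWEEN lower AND upper
                        THEN 1.0 ELSE 0.0 END) AS coverage
FROM prediction_log_hourly
WHERE actual IS NOT NULL AND actual != 0
GROUP BY hour
ORDER BY hour
"""


def hourly_rollup(conn: sqlite3.Connection) -> List[Dict[str, float]]:
    """Return one dict per hour of day with n, mae, mape, and coverage."""
    cursor = conn.execute(ROLLUP)
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]
```

Keying the table on (date, hour, model) means a rerun for the same day cannot silently duplicate rows; it has to update the existing ones.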
@vercel

vercel Bot commented Jun 10, 2026


The latest updates on your projects. Learn more about Vercel for GitHub.

Project      Deployment  Actions           Updated (UTC)
gridalytics  Ready       Preview, Comment  Jun 10, 2026 7:22pm
