feat: add periodic refresh table jobs & refactor ProgressTracker #23737
Conversation
Pull Request Overview
This PR introduces infrastructure for periodic refresh table jobs and refactors the progress tracking mechanism from a static global singleton to an instance-based approach managed by a new GlobalRefreshManager.
- Added a new `periodic_refresh_jobs` database table to track periodic refresh schedules and status
- Introduced `GlobalRefreshManager` to centralize refresh process management and periodic job scheduling
- Refactored `REFRESH_TABLE_PROGRESS_TRACKER` from a static `LazyLock<Mutex<>>` to an instance-based `Arc<RwLock<>>` owned by `GlobalRefreshManager`
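The static-to-instance refactor described above can be sketched roughly as follows. This is a simplified illustration of the pattern only, not the actual RisingWave types: the tracker contents, field names, and methods here are hypothetical stand-ins.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

type TableId = u32;

// Hypothetical simplified tracker; the real RefreshProgressTracker
// in src/meta/src/stream/refresh_manager.rs carries more state.
#[derive(Default)]
struct RefreshProgressTracker {
    in_progress: HashMap<TableId, u64>, // table -> rows processed (illustrative)
}

// Instead of a process-wide `static TRACKER: LazyLock<Mutex<..>>`,
// each manager instance owns its tracker behind Arc<RwLock<..>>,
// so it can be cloned into barrier workers and services.
struct GlobalRefreshManager {
    progress_tracker: Arc<RwLock<RefreshProgressTracker>>,
}

impl GlobalRefreshManager {
    fn new() -> Self {
        Self {
            progress_tracker: Arc::new(RwLock::new(RefreshProgressTracker::default())),
        }
    }

    fn record_progress(&self, table: TableId, rows: u64) {
        self.progress_tracker.write().unwrap().in_progress.insert(table, rows);
    }

    fn progress_of(&self, table: TableId) -> Option<u64> {
        self.progress_tracker.read().unwrap().in_progress.get(&table).copied()
    }
}

fn main() {
    let mgr = GlobalRefreshManager::new();
    mgr.record_progress(42, 1000);
    assert_eq!(mgr.progress_of(42), Some(1000));
}
```

Owning the lock per instance (rather than in a global) makes the tracker's lifetime explicit and lets tests construct isolated managers.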
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| src/meta/src/stream/refresh_manager.rs | Core implementation of GlobalRefreshManager with periodic refresh scheduling, job registration, and refactored progress tracking from static global to instance-based |
| src/meta/src/barrier/worker.rs | Updated to accept and pass GlobalRefreshManagerRef to barrier worker context |
| src/meta/src/barrier/manager.rs | Updated to accept and pass GlobalRefreshManagerRef to barrier manager |
| src/meta/src/barrier/context/recovery.rs | Updated to use instance-based progress tracker instead of static global |
| src/meta/src/barrier/context/mod.rs | Added global_refresh_manager field to worker context struct |
| src/meta/src/barrier/context/context_impl.rs | Updated all progress tracker accesses to use instance-based tracker from global_refresh_manager |
| src/meta/service/src/stream_service.rs | Added global_refresh_manager parameter to stream service and passed to RefreshManager |
| src/meta/node/src/server.rs | Initialized GlobalRefreshManager, started periodic refresh loop, and wired it through the service stack (contains duplicate code blocks that need cleanup) |
| src/meta/model/src/periodic_refresh_job.rs | New entity model for periodic refresh jobs table |
| src/meta/model/src/lib.rs | Added periodic_refresh_job module export |
| src/meta/model/migration/src/m20251110_224156_periodic_refresh_jobs.rs | New migration to create periodic_refresh_jobs table |
| src/meta/model/migration/src/lib.rs | Registered new periodic refresh jobs migration |
- Introduced `RefreshProgressTracker` to manage progress across multiple actors during refresh operations, preventing race conditions.
- Updated data structures to track per-actor progress for the list and load phases.
- Added a new `RefreshProgress` protobuf message for communication.
- Enhanced `BarrierCompleteResult` to include refresh progress data.
- Integrated the tracker into `DatabaseCheckpointControl` and updated related components for compatibility.
- Added a migration for the new refresh job table and related functionality.

Next steps include integrating the tracker with barrier checkpoint control and updating RPC call sites to handle refresh progress.
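The per-actor tracking mentioned above might look like the following sketch. The phase names and methods are hypothetical; the point is that completion is decided only when every registered actor reports done, which is what prevents the race where one actor finishing is mistaken for all actors finishing.

```rust
use std::collections::HashMap;

type ActorId = u32;

// Illustrative phases; the real list/load phases live in the
// RefreshProgress protobuf message added by this PR.
#[derive(Clone, Copy, PartialEq, Debug)]
enum RefreshPhase {
    Listing,
    Loading,
    Done,
}

#[derive(Default)]
struct RefreshProgress {
    per_actor: HashMap<ActorId, RefreshPhase>,
}

impl RefreshProgress {
    fn update(&mut self, actor: ActorId, phase: RefreshPhase) {
        self.per_actor.insert(actor, phase);
    }

    // Complete only when every actor has reported Done.
    fn all_done(&self) -> bool {
        !self.per_actor.is_empty()
            && self.per_actor.values().all(|p| *p == RefreshPhase::Done)
    }
}

fn main() {
    let mut p = RefreshProgress::default();
    p.update(1, RefreshPhase::Loading);
    p.update(2, RefreshPhase::Done);
    assert!(!p.all_done());
    p.update(1, RefreshPhase::Done);
    assert!(p.all_done());
}
```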
Pull Request Overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 4 comments.
- Updated SLT queries to include retry logic with backoff for improved reliability.
- Removed redundant logging in `context_impl.rs` after table refresh completion.
- Enhanced error handling in `alter_op.rs` for refresh job insertion, logging when a job already exists.
- Added logging for table refresh completion in `refresh_manager.rs`.
- Changed the logging level from info to debug in `materialize.rs` for progress tracking.
- Removed the `refresh_state` field from the `Table` message in `catalog.proto` and related code.
- Updated `TableCatalog` and other components to eliminate references to the removed `refresh_state`.
- Refactored the `RefreshState` enum and its usage across the codebase to streamline refresh job management.
- Adjusted migration files to reflect the removal of the `source_refresh_mode` migration.
- Enhanced refresh job status handling in various modules to ensure consistency.
- Introduced `ListRefreshTableStatesRequest` and `ListRefreshTableStatesResponse` messages in `meta.proto` to facilitate querying refresh job states.
- Implemented the `list_refresh_table_states` method in the `FrontendMetaClient` and `StreamManagerService` to handle the new RPC.
- Created the `RwRefreshTableState` struct to represent the state of refresh jobs in the system catalog.
- Updated migration files to accommodate changes in refresh job handling.
- Enhanced the `MetaClient` to support the new RPC call for listing refresh table states.
…tures
- Added a new migration for `source_refresh_mode` to enhance refresh job capabilities.
- Updated the `RefreshJob` and `RwRefreshTableState` structures to remove deprecated fields and accommodate new logic.
- Modified the `StreamManagerService` to handle timestamp conversions for last trigger times.
- Enhanced the `GlobalRefreshManager` to streamline refresh job management and ensure proper state handling.
- Included the `chrono` dependency for improved date and time handling across the codebase.
@tabVersion This is a user-facing feature, so please add a release note to describe this feature and provide an example to illustrate how to use it, thanks.
Please also mention that this is an experimental feature for the doc team's reference.
proto/plan_common.proto
Outdated
message SourceRefreshMode {
  message SourceRefreshModeStreaming {}
  message SourceRefreshModeFullRecompute {}
User-facing question: after this PR, the user specifies refresh_mode = 'FULL_RECOMPUTE' in order to use the refreshable batch source table feature. Personally, I feel FULL_RECOMPUTE can mislead users into thinking there will be a full recomputation on the MV, instead of just on the table to calculate the diff.
I am wondering whether we should call it refresh_mode = 'SNAPSHOT_DIFF' instead. WDYT? @tabVersion @chenzl25
We borrowed the name from here: https://docs.databricks.com/aws/en/optimizations/incremental-refresh#determine-the-refresh-type-of-an-update I think it is fine, because that's how this table is refreshed. A snapshot diff is more like something we generate for the downstream.
We had an offline discussion; the name was finalized as FULL_RELOAD.
…ation
- Added a new `refresh_manager.py` file to define panels for monitoring refresh job metrics in the RisingWave Dev Dashboard.
- Updated the `MetaMetrics` struct to include metrics for refresh job duration, finish count, cron job triggers, and misses.
- Enhanced the `GlobalRefreshManager` to track and report refresh job metrics, including success and failure statuses.
- Modified the `remove_progress_tracker` method to log metrics upon job completion or failure.
- Updated the dashboard JSON files to reflect the new refresh manager panels.
…ELOAD'
- Changed references in multiple files to reflect the updated refresh mode terminology.
- Adjusted error messages and logic to ensure consistency with the new naming convention.
- Updated related tests and utility functions to align with the changes.
✅ Cherry-pick PRs (or issues, if conflicts were encountered) have been created successfully for all target branches.
- Added a new `periodic_refresh_jobs` table.
- Introduced `GlobalRefreshManager` to manage ongoing refresh processes and periodic refresh jobs.

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
following #23527
Key aspects of this change include:
Checklist
Documentation
Release note
PLEASE MARK AS EXPERIMENTAL
With FULL_RELOAD, you can set up data sources (such as Iceberg tables) to be periodically reloaded into RisingWave, ensuring your views and queries stay up to date with the latest batch data, without manual intervention.
How it Works
When creating a refreshable table, you can now specify a FULL_RELOAD mode with an optional refresh_interval_sec property:
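The PR text names `refresh_mode` and `refresh_interval_sec` but does not show concrete syntax, so the following is an illustrative sketch only: the table schema and connector properties are placeholders, and the exact clause placement should be confirmed against the released documentation.

```sql
-- Illustrative only: schema and connector options are assumptions.
CREATE TABLE orders_snapshot (
    order_id BIGINT,
    amount   DOUBLE PRECISION
)
WITH (
    connector = 'iceberg',
    refresh_mode = 'FULL_RELOAD',   -- fully reload the source on each refresh
    refresh_interval_sec = 3600     -- hypothetical: reload every hour
);
```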
- You can still manually refresh the table.
- The refresh status can be queried with SQL.
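As a hedged illustration of the manual refresh and the status query, assuming the system catalog relation follows the `RwRefreshTableState` struct mentioned earlier (the statement form and relation name below are inferences, not confirmed syntax):

```sql
-- Trigger a manual refresh (statement form inferred from the feature description).
REFRESH TABLE orders_snapshot;

-- Inspect refresh job states via the system catalog (relation name inferred
-- from the RwRefreshTableState struct added in this PR).
SELECT * FROM rw_catalog.rw_refresh_table_states;
```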