
feat(tools): add legacy data migration subcommand#173

Closed
Vui-Chee wants to merge 1 commit into main from devin/1772690235-legacy-migrate-tool

Conversation


@Vui-Chee Vui-Chee commented Mar 5, 2026

Description

Ports the legacy data migration tool from PR #137 into the existing bin/tools binary as a new migrate subcommand, rather than adding a separate binary.

The migrate command moves data from legacy MDBX-only storage into the new RocksDB + static files layout:

  • Static files: TransactionSenders, AccountChangeSets, Receipts
  • RocksDB: TransactionHashNumbers, AccountsHistory, StoragesHistory

Static file and RocksDB migrations run in parallel via std::thread::scope, and segments within each are parallelized with rayon.

New files

  • bin/tools/migrate.rs — core migration logic for static file segments and RocksDB tables
  • bin/tools/migrate_command.rs — CLI command struct (--batch-size, --skip-static-files, --skip-rocksdb, --keep-mdbx)
  • bin/tools/progress.rs — progress logging with ETA

CLI flags

Flag                 Default  Description
--batch-size         10000    RocksDB write batch size
--skip-static-files  false    Skip static file migration
--skip-rocksdb       false    Skip RocksDB migration
--keep-mdbx          false    Don't clear migrated MDBX tables after migration

Notable API adaptations from PR #137

The original PR targeted an older reth revision. Key changes made during porting:

  • batch_with_auto_commit() → batch() (API no longer exists)
  • Removed set_expected_block_start() — block start is set at construction via get_writer(genesis_block, segment)
  • StorageSettings::base() → StorageSettings::legacy()
  • Removed StorageChangeSets segment (no longer exists in this reth revision)

Dependency note

Enabled reth-stages with rocksdb feature as a workaround for upstream feature unification issue. Enabling only reth-provider/rocksdb causes compilation error in reth-stages (upstream bug at this reth revision). reth-stages is not actually imported/used by this code — it's only added to resolve feature flags.

Type of Change

  • New feature (non-breaking change which adds functionality)

Code Guidelines

Relevant guidelines:

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works ⚠️
  • New and existing unit tests pass locally with my changes ⚠️

Testing

⚠️ Compilation verified with cargo c -p xlayer-reth-tools — passes successfully.

⚠️ Full test suite not run — just t timed out during workspace-wide compilation. No runtime testing performed. The migration logic is adapted from PR #137 but has not been validated on the current codebase.

Additional Notes

⚠️ Review priorities

  1. Dependency workaround: The reth-stages with rocksdb feature is only added to fix feature unification, not because the tools binary uses it. Is this acceptable?

  2. Batch API semantics: Changed from batch_with_auto_commit() to batch(). Verify this is correct — the code manually calls batch.commit() after every batch_size operations. Does batch() require explicit commit, or auto-commit on drop?

  3. Default MDBX clearing behavior: By default (--keep-mdbx NOT set), the tool clears the migrated MDBX tables after migration. This is destructive and irreversible. Users must opt-in to keep MDBX data. Is this the desired UX?

  4. No checkpoint/resume for RocksDB migration: RocksDB table migrations start from scratch each time (walk from None). Static file migration is resumable via get_highest_static_file_block, but RocksDB migration is all-or-nothing per table. Is this acceptable?

  5. Complex type for Receipts table: Line 193 uses a turbofish with nested associated types:

    tx.clear::<tables::Receipts<<<N as reth_node_builder::NodeTypes>::Primitives as reth_primitives_traits::NodePrimitives>::Receipt>>()?;

    This is fragile. Verify it compiles and is correct.

  6. No tests added: The migration logic has no unit tests. Runtime behavior is untested.

Related



Port legacy data migration tool from PR #137 into existing bin/tools
as a new 'migrate' subcommand. Migrates data from MDBX to RocksDB +
static files for TransactionSenders, AccountChangeSets, Receipts,
TransactionHashNumbers, AccountsHistory, and StoragesHistory tables.

Key changes:
- Add migrate.rs: core migration logic for static files and RocksDB
- Add migrate_command.rs: CLI command with options (batch-size,
  skip-static-files, skip-rocksdb, keep-mdbx)
- Add progress.rs: progress logging utilities
- Update main.rs: add Migrate variant to Commands enum
- Update Cargo.toml: add reth-provider/rocksdb, reth-stages/rocksdb,
  reth-static-file-types, reth-storage-api dependencies
- Update .reth-dev.toml: add reth-stages and reth-static-file-types
  entries for local dev builds

Co-Authored-By: vuicheesiew@gmail.com <vuicheesiew@gmail.com>

KamiD commented Mar 11, 2026

@claude please review it


KamiD commented Mar 11, 2026

@claude please review


KamiD commented Mar 11, 2026

@claude review

@JimmyShi22

@review


github-actions bot commented Mar 12, 2026

Claude finished @JimmyShi22's task in 2m 43s


Code Review — feat(tools): add legacy data migration subcommand

  • Read repository files and changed code
  • Analyzed code quality and correctness
  • Posted review findings

Summary

The PR ports the legacy data migration tool from PR #137 into the existing bin/tools binary as a new migrate subcommand. The implementation is functional in structure, but there are several correctness concerns, design issues, and code quality problems that should be addressed before merging — particularly since this is a destructive, one-way operation that clears MDBX tables by default.


Critical / High Issues

1. AccountChangeSets writer gap-fill is conditional on a state that may not be set

bin/tools/migrate.rs:123-125

if writer.current_block_number().is_some() {
    writer.ensure_at_block(block.saturating_sub(1))?;
}

If the writer is brand-new (no prior data), current_block_number() may return None, causing the ensure_at_block to be skipped. This means blocks between start and the first entry in the cursor result that have no account changesets will not be padded, potentially creating a static file with incorrect block offsets. An early-block chain (where blocks 1..N have no state changes) could silently produce a corrupted file. This needs to be verified against the static file writer contract — if a new writer requires all blocks to be explicitly padded from the start, this is a bug.



2. blocks_processed not updated after loop ends in AccountChangeSets

bin/tools/migrate.rs:144-146

if !block_changesets.is_empty() {
    writer.append_account_changeset(block_changesets, current_block)?;
}

After the loop, blocks_processed is never incremented for the final batch of blocks. This means the final progress log will show incorrect completion numbers. While not a data correctness issue, it means the "Done" summary log will undercount blocks processed.



3. Significant code duplication in migrate_rocksdb_table

bin/tools/migrate.rs:232-303

The three match arms for TransactionHashNumbers, AccountsHistory, and StoragesHistory are nearly identical — each creates a cursor, batch, and counter, iterates with walk(None), puts entries in batches, and commits. This ~60 lines of duplicated logic is fragile (bugs fixed in one arm need to be fixed in all three) and hard to maintain.

Consider using a macro or making migrate_rocksdb_table generic over the table type via a trait bound, so the loop body is written once. At minimum, extract the repeated loop into a shared helper.



4. Receipts migration uses seek_exact per tx instead of sequential cursor walk

bin/tools/migrate.rs:154-158

for tx_num in body.first_tx_num..body.first_tx_num + body.tx_count {
    if let Some(receipt) = cursor.seek_exact(tx_num)?.map(|(_, r)| r) {

seek_exact performs a binary search / key lookup for each individual transaction number, which is O(log N) per entry. Since the outer loop iterates blocks sequentially and tx_num is monotonically increasing, this should use a sequential cursor walk or seek + next pattern instead. For chains with millions of transactions this could be significantly slower than necessary.



Medium Issues

5. --batch-size description is misleading

bin/tools/migrate_command.rs:27-28

/// Block batch size for processing.
#[arg(long, default_value = "10000")]
batch_size: u64,

The flag is described as "Block batch size for processing" but it's actually used as the RocksDB write batch commit interval (number of entries, not blocks). Static file migration doesn't use it at all. The description should read something like "Number of entries per RocksDB write batch commit".


6. can_migrate_receipts only checks receipts_log_filter, not global receipt pruning

bin/tools/migrate_command.rs:77

let can_migrate_receipts = prune_modes.receipts_log_filter.is_empty();

This skips migration only when contract-specific log filters exist. If there's global receipt pruning configured (e.g., prune_modes.receipts is set to prune all receipts after N blocks), the code will still attempt to migrate receipts into static files, but the data in MDBX may already be incomplete. Consider also checking prune_modes.receipts.is_none() or similar fields.


7. reth-stages workaround should have a tracking issue

bin/tools/Cargo.toml:36

reth-stages = { workspace = true, features = ["rocksdb"] }

The PR description acknowledges this is a workaround for a feature unification bug in reth-stages. This dependency is not imported anywhere in the code. While the explanation is in the PR description, the comment in Cargo.toml itself should reference a tracking issue or TODO so it doesn't become permanent dead weight. Add a comment explaining why this is here and when it can be removed.


8. Destructive default behavior (clearing MDBX) deserves extra friction

bin/tools/migrate_command.rs:172-212

The tool clears migrated MDBX tables by default (irreversible on large databases). While the --keep-mdbx flag exists, the default behavior being destructive is an unusual and risky UX pattern for a data migration tool. Consider inverting the default: require --clear-mdbx to delete data (i.e., keep by default, delete on explicit opt-in). The current behavior is documented via a warning log at runtime, but operators running this in scripts may not notice.


Low Issues

9. Final empty batch commit is slightly wasteful

bin/tools/migrate.rs:254, 276, 300

batch.commit()?;
count

After the loop, batch.commit() is called unconditionally. If count % batch_size == 0, the batch was already committed inside the loop and a new empty batch was created (batch = rocksdb.batch()). Committing an empty batch is likely a no-op but is slightly wasteful. Guard with if count % batch_size != 0 { batch.commit()?; }.


10. Silent no-op for unknown table name in migrate_rocksdb_table

bin/tools/migrate.rs:303

_ => 0,

If an unknown table name is passed, the function returns 0 silently. This is only called from migrate_to_rocksdb with known table names, so it's safe today. But returning an error or unreachable!() would be more robust.


11. senders_have_data check uses tx-level check rather than block-level

bin/tools/migrate_command.rs:148-151

let senders_have_data = static_file_provider
    .get_highest_static_file_tx(StaticFileSegment::TransactionSenders)
    .is_some();

For a chain where the genesis block is the only block and it has zero transactions, this returns None even if the migration wrote the genesis block correctly. The check could give a false negative, resulting in TransactionSenders not being enabled in storage settings even though the migration ran. Consider also checking get_highest_static_file_block.


12. No validation that static file and RocksDB migrations did not partially fail before clearing MDBX

bin/tools/migrate_command.rs:122-124

info!(target: "reth::cli", "Finalizing migration");
self.finalize::<N>(&provider_factory, can_migrate_receipts)?;

If skip_static_files = true but skip_rocksdb = false, the finalize step still clears both static file AND RocksDB tables if --keep-mdbx is not set (lines 180-212 check skip_static_files and skip_rocksdb correctly). This part looks correct on review — but it's subtle and worth a comment in the code to make the intent explicit.


Design Questions (from PR description, addressed)

Q1: reth-stages workaround — Acceptable short-term, but needs a // TODO: remove once upstream bug is fixed comment with a link to the upstream issue.

Q2: batch() vs batch_with_auto_commit() — The current approach (manual commit() every batch_size entries, then final commit()) is correct. batch() requires explicit commit; data is durably written only when commit() is called. The pattern is sound.

Q3: Default MDBX clearing behavior — See issue #8 above. Recommend inverting the default.

Q4: No checkpoint/resume for RocksDB — For large deployments this is a significant gap. If the migration crashes mid-way through the RocksDB phase, it must restart from the beginning. Worth tracking as a known limitation, and perhaps adding a --force-restart flag to make the intent explicit rather than silently restarting.

Q5: Complex turbofish for Receipts table — tx.clear::<tables::Receipts<<<N as reth_node_builder::NodeTypes>::Primitives as reth_primitives_traits::NodePrimitives>::Receipt>>() is verbose but correct if it compiles. A type alias at the top of the function would improve readability.

Q6: No tests — Given this is a destructive one-time migration tool, integration tests with a small test database would be especially valuable.


Minor Style Notes

  • The verbose per-table info! logs before and after each tx.clear() call (e.g., "Clearing TransactionSenders table..." / "TransactionSenders table cleared") are redundant — consider consolidating.
  • // Start tracking time comment on line 83 of migrate_command.rs is redundant given let start_time = Instant::now() is self-explanatory.

Verdict

The code compiles (per PR author) and the overall structure is sound — parallelizing static files vs RocksDB, using rayon within each, and the resumable static file migration design are all good choices. However, the AccountChangeSets gap-fill logic (#1) is a potential correctness bug for sparse chains, the code duplication (#3) and performance issue (#4) should be fixed, and the destructive-by-default behavior (#8) deserves careful discussion before merging. Tests are strongly recommended given the irreversible nature of this tool.

@JimmyShi22
Copy link
Contributor

@review pls comment inline


github-actions bot commented Mar 12, 2026

Claude finished @JimmyShi22's task in 2m 12s


Code Review — feat(tools): add legacy data migration subcommand

  • Read changed files
  • Analyzed code for bugs, performance, and correctness issues
  • Posted inline review comments on the PR

13 inline comments posted on the PR covering:

Critical / Correctness

  • migrate.rs:123 — ensure_at_block skipped for brand-new writers (potential corrupt static file for sparse chains)
  • migrate.rs:144 — blocks_processed not updated for the final block group after the loop

Performance

  • migrate.rs:155 — seek_exact per-tx is O(log N) on monotonically increasing keys; should use sequential cursor walk

Code Quality

  • migrate.rs:232-303 — ~60 lines of code duplicated across 3 RocksDB table arms; should be extracted to a generic helper
  • migrate.rs:254/277/300 — Potentially wasteful empty batch commit after loop
  • migrate.rs:303 — Silent no-op for unknown table name; unreachable!() would be more defensive
  • migrate_command.rs:193 — Deeply nested turbofish for Receipts type; a type alias would improve readability

Design / UX

  • migrate_command.rs:172 — Destructive-by-default MDBX clearing; recommend inverting default to require explicit --clear-mdbx flag
  • migrate_command.rs:77 — can_migrate_receipts only checks log filter, misses global receipt pruning
  • migrate_command.rs:148 — senders_have_data tx-level check gives false negative for genesis-only chains

Documentation

  • Cargo.toml:36 — reth-stages workaround needs an explanatory comment with tracking issue reference
  • migrate_command.rs:27 — --batch-size description is misleading (entries, not blocks)
  • migrate_command.rs:82 — Redundant // Start tracking time comment


@github-actions github-actions bot left a comment


Overall structure is sound — parallelizing static files vs RocksDB migrations and using rayon within each is a good design. However there are several correctness concerns, a performance issue, and the destructive-by-default MDBX clearing behavior needs careful review before merging.

current_block,
)?;
}
if writer.current_block_number().is_some() {

Potential correctness bug: ensure_at_block skipped for brand-new writers

When the writer is brand-new (no prior data), current_block_number() returns None, so the ensure_at_block(block.saturating_sub(1)) call is skipped entirely. This means blocks between start and the first entry in the cursor (blocks with no account changesets) will not be padded in the static file.

If the static file writer contract requires all blocks to be explicitly represented from the start, this will produce a corrupted file for chains where early blocks have no state changes. Either call ensure_at_block unconditionally, or explicitly handle the None case to pad from start to block - 1.

entries_processed += 1;
}

if !block_changesets.is_empty() {

blocks_processed is not updated for the final block after the loop

After the walk_range loop ends, blocks_processed is never incremented for current_block (the last group of blocks). The final progress logged in the summary will undercount. Consider adding blocks_processed += current_block - start + 1; (or similar) after the loop before the commit step.

for block in start..=to_block {
if let Some(body) = provider.block_body_indices(block)? {
for tx_num in body.first_tx_num..body.first_tx_num + body.tx_count {
if let Some(receipt) = cursor.seek_exact(tx_num)?.map(|(_, r)| r) {

Performance: seek_exact per-tx is O(log N) on monotonically increasing keys

seek_exact performs a key lookup for each individual tx_num. Since the outer block loop iterates sequentially and tx_num is monotonically increasing, this should instead use a sequential cursor walk — seek to body.first_tx_num once, then call cursor.next() for subsequent entries in the range. Using seek_exact per entry here is significantly slower than necessary on large chains.

let table_start = Instant::now();
let mut last_log = Instant::now();

let count = match table {

Significant code duplication across the three RocksDB table migration arms

The TransactionHashNumbers, AccountsHistory, and StoragesHistory arms are nearly identical (~60 lines repeated 3×): create cursor, create batch, iterate walk(None), put into batch, commit every batch_size, final commit. Any bug fix or enhancement needs to be applied in all three places.

Consider extracting the shared logic into a generic helper — e.g., a macro or a function generic over the table type via trait bounds — so the loop body is written once.

}
}

batch.commit()?;

Potentially wasteful empty batch commit

When count % batch_size == 0, the batch was already committed inside the loop and a fresh batch = rocksdb.batch() was created. The unconditional batch.commit() here then commits an empty batch. This is likely a no-op but is wasteful. Guard it: if count % batch_size != 0 { batch.commit()?; }

(Same pattern applies to lines 277 and 300 in the AccountsHistory and StoragesHistory arms.)


// Check if TransactionSenders actually has data in static files
let static_file_provider = provider_factory.static_file_provider();
let senders_have_data = static_file_provider

senders_have_data check may give false negative for genesis-only chains

get_highest_static_file_tx returns None for a chain where the genesis block has zero transactions (i.e., no transactions were ever appended). In that case, TransactionSenders static file storage would not be enabled even though the migration ran correctly. Consider also checking get_highest_static_file_block(StaticFileSegment::TransactionSenders).is_some() as a secondary condition.

provider.write_storage_settings(new_settings)?;

// Drop migrated MDBX tables unless --keep-mdbx is set
if !self.keep_mdbx {

Destructive-by-default: MDBX tables are cleared unless --keep-mdbx is passed

For a data migration tool operating on potentially TBs of irreplaceable data, having the default behavior be destructive and irreversible is risky. Users running this in scripts or automation may not realize data will be deleted. Consider inverting the default: keep MDBX tables by default and require an explicit --clear-mdbx flag to delete them. This is a safer UX pattern for one-way destructive operations.

warn!(target: "reth::cli", "Receipts will NOT be migrated due to contract log pruning");
}

// Start tracking time

Redundant comment

// Start tracking time is self-evident from let start_time = Instant::now(). The comment adds no information. Remove it to reduce noise.

reth-chainspec.workspace = true
reth-provider.workspace = true
reth-provider = { workspace = true, features = ["rocksdb"] }
reth-stages = { workspace = true, features = ["rocksdb"] }

Workaround dependency needs an explanatory comment and tracking issue

reth-stages with rocksdb feature is not imported or used anywhere in this crate — it's only here to fix a feature unification compilation error in upstream reth-stages. Without a comment, future maintainers will not know why this is here and may remove it or be confused.

Please add a comment explaining this:

# Workaround: reth-stages must be listed with `rocksdb` feature to fix upstream feature
# unification bug. Not actually imported by this crate. Remove once upstream fixes the issue.
reth-stages = { workspace = true, features = ["rocksdb"] }

Ideally reference an upstream tracking issue if one exists.


if can_migrate_receipts {
info!(target: "reth::cli", "Clearing Receipts table...");
tx.clear::<tables::Receipts<<<N as reth_node_builder::NodeTypes>::Primitives as reth_primitives_traits::NodePrimitives>::Receipt>>()?;

Fragile turbofish for Receipts table type — consider a type alias

tx.clear::<tables::Receipts<<<N as reth_node_builder::NodeTypes>::Primitives as reth_primitives_traits::NodePrimitives>::Receipt>>()?;

This deeply nested type expression is brittle and hard to read. A type alias at the top of finalize() would improve readability and maintainability:

type ReceiptTable<N> = tables::Receipts<
    <<N as reth_node_builder::NodeTypes>::Primitives
     as reth_primitives_traits::NodePrimitives>::Receipt
>;
tx.clear::<ReceiptTable<N>>()?;

@Vui-Chee Vui-Chee closed this Mar 13, 2026

@Vui-Chee Vui-Chee left a comment


Devin Review found 1 potential issue.

View 5 additional findings in Devin Review.


Comment on lines +160 to +166
let new_settings = StorageSettings::legacy()
.with_receipts_in_static_files(can_migrate_receipts)
.with_account_changesets_in_static_files(true)
.with_transaction_senders_in_static_files(senders_have_data)
.with_transaction_hash_numbers_in_rocksdb(true)
.with_account_history_in_rocksdb(true)
.with_storages_history_in_rocksdb(true);

🔴 Storage settings ignore --skip-static-files and --skip-rocksdb flags, causing data to be read from empty backends

In finalize(), the StorageSettings are built unconditionally enabling all migrated backends (lines 160-166), regardless of whether --skip-static-files or --skip-rocksdb was passed. For example, .with_account_changesets_in_static_files(true) is hardcoded even if static file migration was skipped, and .with_transaction_hash_numbers_in_rocksdb(true) is hardcoded even if RocksDB migration was skipped. While the MDBX clearing code correctly checks these flags (bin/tools/migrate_command.rs:180 and bin/tools/migrate_command.rs:198), the settings themselves don't. After migration with e.g. --skip-rocksdb, the persisted settings would tell the node to look up transaction hash numbers, account history, and storage history from RocksDB—which is empty—instead of MDBX where the data still resides. This would cause lookup failures at runtime.

Suggested change

Before:

let new_settings = StorageSettings::legacy()
    .with_receipts_in_static_files(can_migrate_receipts)
    .with_account_changesets_in_static_files(true)
    .with_transaction_senders_in_static_files(senders_have_data)
    .with_transaction_hash_numbers_in_rocksdb(true)
    .with_account_history_in_rocksdb(true)
    .with_storages_history_in_rocksdb(true);

After:

let new_settings = StorageSettings::legacy()
    .with_receipts_in_static_files(!self.skip_static_files && can_migrate_receipts)
    .with_account_changesets_in_static_files(!self.skip_static_files)
    .with_transaction_senders_in_static_files(!self.skip_static_files && senders_have_data)
    .with_transaction_hash_numbers_in_rocksdb(!self.skip_rocksdb)
    .with_account_history_in_rocksdb(!self.skip_rocksdb)
    .with_storages_history_in_rocksdb(!self.skip_rocksdb);

