Bump vendored DuckDB in v1.5-variegata to include TaskExecutor busy-spin fix (upstream PR #22092) #667

@wdroste

Summary

duckdb_jdbc v1.5-variegata CI builds (confirmed via run 24636193614 on 2026-04-19) ship a libduckdb_java.so_linux_amd64 that is byte-for-byte identical to the one in the v1.5.2.0 release (MD5 9bf6bc9415ad3ba3cdcddc76b6a5cebe). As a result, the upstream DuckDB fix from PR #22092, "Cooperative tasks might lead to busy spinning in TaskExecutor::WorkOnTasks", which has been on DuckDB's v1.5-variegata branch since 2026-04-16, is not available in any downloadable duckdb_jdbc artifact.

Please bump the vendored DuckDB under src/duckdb/ to current v1.5-variegata HEAD (or cherry-pick DuckDB commit fa3f53318349a5e47ff20e5bc1e6863772f86d5e / merge 67fe1eed4f) so consumers of duckdb_jdbc get the fix.

Background

We hit the busy-spin in production under duckdb/duckdb-postgres 1.5.2.0 while executing ATTACH 'postgresql://...' (TYPE postgres, SCHEMA '...'). One thread was pinned at 100% CPU for 9+ minutes (RUNNABLE, no syscalls) while holding a caller-side lock, which stalled 33 sibling threads.

Symptoms

jstack on the hung thread consistently shows:

RUNNABLE  cpu ≈ elapsed  (100% CPU, no syscalls)
  org.duckdb.DuckDBNative.duckdb_jdbc_execute_pending  (Native)
  org.duckdb.DuckDBPreparedStatement.executeDirect
  // ... our ATTACH statement execution

async-profiler flame graph (itimer mode, 30s, on the JVM while hung) shows the hot native leaf inside duckdb_jdbc_execute_pending:

duckdb_moodycamel::ConcurrentQueue<shared_ptr<duckdb::Task>>::ExplicitProducer::dequeue<...>
← duckdb::TaskScheduler::GetTaskFromProducer
← duckdb::TaskExecutor::WorkOnTasks                ← the empty-body while loop
← (called from the JNI pending-result poll)

The profile also shows the characteristic inline-frame noise from unique_ptr<ActiveQueryContext/Executor/ProducerToken>::AssertNotNull. Those frames are hit on every iteration of the spin because the empty-body loop does nothing else.

Root cause (for reference)

src/parallel/task_executor.cpp in v1.5.2 has:

void TaskExecutor::WorkOnTasks() {
    shared_ptr<Task> task_from_producer;
    while (scheduler.GetTaskFromProducer(*token, task_from_producer)) {
        // ... execute ...
    }
    // wait for all active tasks to finish
    while (completed_tasks != total_tasks) {      // ← empty-body busy spin
    }
    // ...
}

Once GetTaskFromProducer returns false (queue empty), the second while-loop spins until another thread completes all outstanding tasks, with no yield() / pause / wait primitive. On any ATTACH where an internal query (e.g. PostgresConnection::GetPostgresVersion's SELECT version(), (SELECT COUNT(*) FROM pg_settings WHERE name LIKE 'rds%')) is still in the "waiting for worker completion" phase, the caller thread pegs a core indefinitely.

The fix in DuckDB PR #22092 merges the two loops and calls std::this_thread::yield() in the empty-queue branch, which looks correct to us.
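For readers who want to see the shape of that change without opening the PR, here is a toy model of a merged loop with a yield in the empty-queue branch. It is a sketch under assumptions, not the upstream patch: ToyQueue, ToyExecutor, and RunToyWorkload are hypothetical names, a mutex-guarded deque stands in for the lock-free moodycamel queue, and C++14 is assumed.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for the scheduler's producer queue; not DuckDB's API.
class ToyQueue {
public:
    void Push(std::function<void()> task) {
        std::lock_guard<std::mutex> guard(lock);
        tasks.push_back(std::move(task));
    }
    // Returns false when the queue is empty, like GetTaskFromProducer in the excerpt.
    bool TryPop(std::function<void()> &out) {
        std::lock_guard<std::mutex> guard(lock);
        if (tasks.empty()) {
            return false;
        }
        out = std::move(tasks.front());
        tasks.pop_front();
        return true;
    }

private:
    std::mutex lock;
    std::deque<std::function<void()>> tasks;
};

// Hypothetical executor that keeps the same two counters as the excerpt.
struct ToyExecutor {
    ToyQueue queue;
    std::atomic<std::size_t> completed_tasks{0};
    std::atomic<std::size_t> total_tasks{0};
    std::size_t yields = 0; // only touched by the thread calling WorkOnTasks

    void Schedule(std::function<void()> work) {
        ++total_tasks;
        queue.Push([this, work = std::move(work)] {
            work();
            ++completed_tasks;
        });
    }

    // One merged loop: run a queued task if there is one, otherwise yield
    // the time slice instead of re-checking the counters in a tight loop.
    void WorkOnTasks() {
        std::function<void()> task;
        while (completed_tasks.load() != total_tasks.load()) {
            if (queue.TryPop(task)) {
                task();
            } else {
                ++yields;
                std::this_thread::yield();
            }
        }
    }
};

// Schedules n tasks and hands one to a slow helper thread, so the caller
// reaches the "queue empty, work still outstanding" state described above.
std::size_t RunToyWorkload(std::size_t n) {
    ToyExecutor executor;
    std::atomic<std::size_t> ran{0};
    for (std::size_t i = 0; i < n; i++) {
        executor.Schedule([&ran] { ++ran; });
    }
    std::function<void()> stolen;
    const bool has_stolen = executor.queue.TryPop(stolen);
    std::thread helper([&] {
        std::this_thread::sleep_for(std::chrono::milliseconds(20));
        if (has_stolen) {
            stolen();
        }
    });
    executor.WorkOnTasks();
    helper.join();
    return ran.load();
}
```

Calling RunToyWorkload(8) runs seven tasks on the caller and waits for the helper to finish the eighth. Note that yield() only gives up the current time slice, so the waiting thread can still consume CPU when nothing else is runnable; the difference from the empty-body loop is that the scheduler gets a chance to run other threads on every pass.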

What's been confirmed

  • ✅ Fix is in DuckDB v1.5-variegata as of 67fe1eed4f (2026-04-16).
  • ✅ Fix is not in DuckDB tag v1.5.2.
  • ✅ The duckdb-java v1.5-variegata CI build from 2026-04-19 produces a .so that is byte-identical to the one in the v1.5.2.0 release (same MD5).
  • ✅ Snapshot publishing to Maven Central is known-broken (Restore snapshots publishing to Maven Central #338) so there is no nightly jar to pull.

Ask

  1. Bump src/duckdb/ to a DuckDB revision that includes 67fe1eed4f (e.g. current v1.5-variegata HEAD).
  2. Cut a patch release. Even an unreleased CI build or a tagged pre-release with a matching upload would help production consumers self-serve while a formal 1.5.3 release is scheduled.

Happy to provide the full flame graph HTML, jstack, and native-lib hashes if helpful.

Thanks!
