Description
- Propose a configurable startup strategy that eagerly loads only recent (“hot”) segments, while leaving older (“cold”) segments to load lazily on first access.
- Propose deprecating druid.segmentCache.lazyLoadOnStart in favor of configs that give more flexibility over how a Historical loads its segment cache during startup.
Motivation
- Non-lazy segment loading takes a long time when a Historical holds many segments (observed ~22 minutes per Historical; ~39 hours cluster-wide).
- Lazy-loading improves startup time but initial queries over hot data can be slow.
- Many clusters primarily query the last N days/weeks; we can make that slice eager at startup to maintain query performance.
Proposal
Deprecate druid.segmentCache.lazyLoadOnStart in favor of a single strategy-driven config:
New: startupCacheLoadStrategy with options:
- loadLazily (all segments lazy)
- loadAllEagerly (all segments eager)
- loadEagerlyForPeriod (recent window eager, older lazy)
When loadEagerlyForPeriod is selected, require a loadPeriod config (ISO-8601 period, e.g., P7D, P30D).
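As a rough illustration of how the three options could decide eager versus lazy loading per segment, here is a minimal sketch. Everything in it is hypothetical: the class, enum, and method names are not existing Druid code, `java.time.Period` stands in for whatever period type the real implementation would use, and it assumes a segment counts as "recent" when its interval end falls after `now - loadPeriod`.

```java
import java.time.Instant;
import java.time.Period;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical sketch only; none of these names exist in Druid.
final class StartupLoadSketch {
    enum Strategy { LOAD_LAZILY, LOAD_ALL_EAGERLY, LOAD_EAGERLY_FOR_PERIOD }

    // Returns true if the segment should be loaded eagerly at startup.
    // Assumption: "recent" means the segment's interval end is after (now - loadPeriod).
    static boolean shouldLoadEagerly(Strategy strategy, Period loadPeriod,
                                     Instant segmentEnd, Instant now) {
        switch (strategy) {
            case LOAD_LAZILY:
                return false;
            case LOAD_ALL_EAGERLY:
                return true;
            case LOAD_EAGERLY_FOR_PERIOD: {
                if (loadPeriod == null) {
                    // The proposal requires loadPeriod for this strategy.
                    throw new IllegalArgumentException(
                        "loadPeriod is required for loadEagerlyForPeriod");
                }
                // Period arithmetic needs a calendar, so go through a UTC date-time.
                Instant cutoff = ZonedDateTime.ofInstant(now, ZoneOffset.UTC)
                    .minus(loadPeriod)
                    .toInstant();
                return segmentEnd.isAfter(cutoff);
            }
            default:
                throw new IllegalStateException("Unknown strategy: " + strategy);
        }
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-01-31T00:00:00Z");
        Period sevenDays = Period.parse("P7D");
        // Segment ending yesterday falls inside the 7-day window.
        System.out.println(shouldLoadEagerly(Strategy.LOAD_EAGERLY_FOR_PERIOD,
            sevenDays, Instant.parse("2024-01-30T00:00:00Z"), now));
        // Segment ending a month ago is left for lazy loading.
        System.out.println(shouldLoadEagerly(Strategy.LOAD_EAGERLY_FOR_PERIOD,
            sevenDays, Instant.parse("2024-01-01T00:00:00Z"), now));
    }
}
```

Comparing against the interval end rather than the start means a segment that only partly overlaps the window is still loaded eagerly; the opposite choice would be equally easy to express and is open for discussion.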
Backward compatibility and migration
Keep reading druid.segmentCache.lazyLoadOnStart for at least a few more releases with a deprecation warning.
We can map true -> loadLazily, false -> loadAllEagerly.
Setting the new startupCacheLoadStrategy overrides the lazyLoadOnStart setting, [Optional: and a warning is logged if both settings are configured].
The advantage of relying on the new config is that it lets us add more load strategies later as needed.
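The migration rules above could be sketched roughly as follows. This is a hypothetical helper, not existing Druid code: the class and method names are made up, warnings go to stderr in place of a real logger, and the fallback when neither setting is present assumes the current eager behavior (lazyLoadOnStart defaulting to false) is kept.

```java
// Hypothetical sketch only; none of these names exist in Druid.
final class StartupStrategyResolver {
    enum Strategy { LOAD_LAZILY, LOAD_ALL_EAGERLY, LOAD_EAGERLY_FOR_PERIOD }

    // newStrategy: value of the proposed startupCacheLoadStrategy, or null if unset.
    // legacyLazyLoadOnStart: value of druid.segmentCache.lazyLoadOnStart, or null if unset.
    static Strategy resolve(String newStrategy, Boolean legacyLazyLoadOnStart) {
        if (newStrategy != null) {
            if (legacyLazyLoadOnStart != null) {
                // Optional behavior from the proposal: warn when both are configured.
                System.err.println("WARN: startupCacheLoadStrategy overrides lazyLoadOnStart");
            }
            return parse(newStrategy);
        }
        if (legacyLazyLoadOnStart != null) {
            System.err.println("WARN: lazyLoadOnStart is deprecated; use startupCacheLoadStrategy");
            // true -> loadLazily, false -> loadAllEagerly
            return legacyLazyLoadOnStart ? Strategy.LOAD_LAZILY : Strategy.LOAD_ALL_EAGERLY;
        }
        // Assumption: with nothing configured, keep today's eager default.
        return Strategy.LOAD_ALL_EAGERLY;
    }

    private static Strategy parse(String name) {
        switch (name) {
            case "loadLazily":
                return Strategy.LOAD_LAZILY;
            case "loadAllEagerly":
                return Strategy.LOAD_ALL_EAGERLY;
            case "loadEagerlyForPeriod":
                return Strategy.LOAD_EAGERLY_FOR_PERIOD;
            default:
                throw new IllegalArgumentException("Unknown startupCacheLoadStrategy: " + name);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, true));                    // legacy setting only
        System.out.println(resolve("loadEagerlyForPeriod", true));  // new config wins
    }
}
```

Keeping the legacy mapping in one place like this would make it straightforward to delete once lazyLoadOnStart is finally removed.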
Config names are open for discussion; do drop some suggestions!