Historical Startup -- Configurable loading strategy #18687
base: master
Changes from 10 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1585,7 +1585,9 @@ These Historical configurations can be defined in the `historical/runtime.proper | |
| |`druid.segmentCache.announceIntervalMillis`|How frequently to announce segments while segments are loading from cache. Set this value to zero to wait for all segments to be loaded before announcing.|5000 (5 seconds)| | ||
| |`druid.segmentCache.numLoadingThreads`|How many segments to drop or load concurrently from deep storage. Note that the work of loading segments involves downloading segments from deep storage, decompressing them and loading them to a memory mapped location. So the work is not all I/O Bound. Depending on CPU and network load, one could possibly increase this config to a higher value.|max(1,Number of cores / 6)| | ||
| |`druid.segmentCache.numBootstrapThreads`|How many segments to load concurrently during historical startup.|`druid.segmentCache.numLoadingThreads`| | ||
| |`druid.segmentCache.lazyLoadOnStart`|Whether or not to load segment columns metadata lazily during historical startup. When set to true, Historical startup time will be dramatically improved by deferring segment loading until the first time that segment takes part in a query, which will incur this cost instead.|false| | ||
| |`druid.segmentCache.lazyLoadOnStart`|_DEPRECATED_ Use `druid.segmentCache.startupLoadStrategy` instead. Whether or not to load segment columns metadata lazily during historical startup. When set to true, Historical startup time will be dramatically improved by deferring segment loading until the first time that segment takes part in a query, which will incur this cost instead.|false| | ||
| |`druid.segmentCache.startupLoadStrategy`|Selects the segment column metadata loading strategy during historical startup. Possible values are `loadAllEagerly`, `loadAllLazily`, and `loadEagerlyBeforePeriod`. More details on each strategy below.|`loadAllEagerly`| | ||
| |`druid.segmentCache.startupLoadPeriod`|Used only when `druid.segmentCache.startupLoadStrategy` is `loadEagerlyBeforePeriod`. Let `t` be the time at which the Historical starts up. Column metadata is loaded lazily for any segment whose interval does not overlap `[t - startupLoadPeriod, t]`.|`P7D`| | ||
|
||
| |`druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnDownload`|Number of threads to asynchronously read segment index files into null output stream on each new segment download after the Historical service finishes bootstrapping. Recommended to set to 1 or 2 or leave unspecified to disable. See also `druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnBootstrap`|0| | ||
| |`druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnBootstrap`|Number of threads to asynchronously read segment index files into null output stream during Historical service bootstrap. This thread pool is terminated after Historical service finishes bootstrapping. Recommended to set to half of available cores. If left unspecified, `druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnDownload` will be used. If both configs are unspecified, this feature is disabled. Preemptively loading segments into page cache helps in the sense that later when a segment is queried, it's already in page cache and only a minor page fault needs to be triggered instead of a more costly major page fault to make the query latency more consistent. Note that loading segment into page cache just does a blind loading of segment index files and will evict any existing segments from page cache at the discretion of operating system when the total segment size on local disk is larger than the page cache usable in the RAM, which roughly equals to total available RAM in the host - druid process memory including both heap and direct memory allocated - memory used by other non druid processes on the host, so it is the user's responsibility to ensure the host has enough RAM to host all the segments to avoid random evictions to fully leverage this feature.|`druid.segmentCache.numThreadsToLoadSegmentsIntoPageCacheOnDownload`| | ||
|
|
||
|
|
@@ -1602,6 +1604,14 @@ In `druid.segmentCache.locationSelector.strategy`, one of `leastBytesUsed`, `rou | |
|
|
||
| Note that if `druid.segmentCache.numLoadingThreads` > 1, multiple threads can download different segments at the same time. In this case, with the `leastBytesUsed` strategy or `mostAvailableSize` strategy, Historicals may select a sub-optimal storage location because each decision is based on a snapshot of the storage location status of when a segment is requested to download. | ||
|
|
||
| In `druid.segmentCache.startupLoadStrategy`, one of `loadAllEagerly`, `loadAllLazily`, or `loadEagerlyBeforePeriod` can be specified to select the strategy used to load segments when the Historical service starts. | ||
|
|
||
| |Strategy|Description| | ||
| |--------|-----------| | ||
| |`loadAllEagerly`|The default startup strategy. The Historical service will load all segment column metadata immediately during the initial startup process.| | ||
| |`loadAllLazily`|Segment column metadata is not loaded during startup, which can significantly reduce Historical startup time. The loading cost is deferred and incurred the first time a segment is referenced by a query.| | ||
| |`loadEagerlyBeforePeriod`|Provides a balance between fast startup and query performance. The Historical service eagerly loads column metadata only for segments whose intervals overlap the most recent period defined by `druid.segmentCache.startupLoadPeriod`. All other segments are loaded on demand when first queried.| | ||
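To make the three strategies in the table concrete, here is a minimal, self-contained sketch of the lazy-or-eager decision. It is not the PR's code: `StartupStrategySketch` and the signature of its `shouldLoadLazily` method are hypothetical, `java.time` stands in for Joda-Time, and a fixed `Duration` stands in for the configured `Period` (the two agree for day-based periods in UTC).

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical stand-in for illustration only; not a class from the PR.
class StartupStrategySketch
{
  /** Returns true when the segment's column metadata would be deferred until first query. */
  static boolean shouldLoadLazily(
      String strategy,
      Instant segmentStart,
      Instant segmentEnd,
      Instant startupTime,
      Duration loadPeriod
  )
  {
    switch (strategy) {
      case "loadAllEagerly":
        return false;
      case "loadAllLazily":
        return true;
      case "loadEagerlyBeforePeriod":
        // Eager window is [startupTime - loadPeriod, startupTime).
        Instant windowStart = startupTime.minus(loadPeriod);
        // Half-open overlap test between the segment interval and the window.
        boolean overlaps = segmentStart.isBefore(startupTime) && windowStart.isBefore(segmentEnd);
        return !overlaps;
      default:
        throw new IllegalArgumentException("Unknown strategy: " + strategy);
    }
  }

  public static void main(String[] args)
  {
    Instant startup = Instant.parse("2025-01-10T00:00:00Z");
    Duration period = Duration.ofDays(7);

    // A segment from two days before startup and one from over a month before.
    Instant recentStart = Instant.parse("2025-01-08T00:00:00Z");
    Instant recentEnd = Instant.parse("2025-01-09T00:00:00Z");
    Instant oldStart = Instant.parse("2024-12-01T00:00:00Z");
    Instant oldEnd = Instant.parse("2024-12-02T00:00:00Z");

    System.out.println(shouldLoadLazily("loadEagerlyBeforePeriod", recentStart, recentEnd, startup, period));
    System.out.println(shouldLoadLazily("loadEagerlyBeforePeriod", oldStart, oldEnd, startup, period));
  }
}
```

With a `P7D` period, only the older segment falls outside the window and is deferred; the other two strategies ignore the segment interval entirely.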
|
Contributor
How feasible/extensible is it to accept a map of datasource to load period, to allow configurable periods per datasource? (similar to the I think having that option would allow a lot more flexibility to operators as the query workloads can be vastly different.
Contributor
Author
This is workable -- I can change e.g. Where
Member
My opinion is that we should keep the change in this PR small. For datasource-level configuration, if there is a real need for this feature, we can implement it by defining a datasource level configuration
Contributor
Author
I feel we can leave this for another PR, since it is out of scope of this intended PR. WDYT? @abhishekrb19 |
||
|
|
||
| #### Historical query configs | ||
|
|
||
| ##### Concurrent requests | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -22,7 +22,10 @@ | |
| import com.fasterxml.jackson.annotation.JacksonInject; | ||
| import com.fasterxml.jackson.annotation.JsonProperty; | ||
| import com.google.common.collect.Lists; | ||
| import org.apache.druid.server.coordination.startup.LoadAllEagerlyStrategy; | ||
| import org.apache.druid.server.coordination.startup.LoadAllLazilyStrategy; | ||
| import org.apache.druid.utils.RuntimeInfo; | ||
| import org.joda.time.Period; | ||
|
|
||
| import java.io.File; | ||
| import java.util.Collections; | ||
|
|
@@ -41,9 +44,19 @@ | |
| @JsonProperty | ||
| private List<StorageLocationConfig> locations = Collections.emptyList(); | ||
|
|
||
| /** | ||
| * @deprecated Use {@link #startupLoadStrategy} instead. | ||
| */ | ||
| @Deprecated | ||
| @JsonProperty("lazyLoadOnStart") | ||
| private boolean lazyLoadOnStart = false; | ||
|
|
||
| @JsonProperty("startupLoadStrategy") | ||
| private String startupLoadStrategy = null; | ||
|
|
||
| @JsonProperty("startupLoadPeriod") | ||
| private Period startupLoadPeriod = new Period("P7D"); | ||
|
|
||
| @JsonProperty("deleteOnRemove") | ||
| private boolean deleteOnRemove = true; | ||
|
|
||
|
|
@@ -84,11 +97,31 @@ | |
| return locations; | ||
| } | ||
|
|
||
| /** | ||
| * @deprecated Use {@link #getStartupCacheLoadStrategy()} instead. | ||
| * Removal of this method in the future will require a change in {@link #getStartupCacheLoadStrategy()} | ||
| * to default to {@link LoadAllEagerlyStrategy#STRATEGY_NAME} when {@link #startupLoadStrategy} is null. | ||
| */ | ||
| @Deprecated | ||
| public boolean isLazyLoadOnStart() | ||
| { | ||
| return lazyLoadOnStart; | ||
| } | ||
|
|
||
| public String getStartupCacheLoadStrategy() | ||
|
||
| { | ||
| return startupLoadStrategy == null | ||
| ? isLazyLoadOnStart() | ||
|
||
| ? LoadAllLazilyStrategy.STRATEGY_NAME | ||
| : LoadAllEagerlyStrategy.STRATEGY_NAME | ||
| : startupLoadStrategy; | ||
| } | ||
|
|
||
| public Period getStartupLoadPeriod() | ||
| { | ||
| return startupLoadPeriod; | ||
| } | ||
|
|
||
| public boolean isDeleteOnRemove() | ||
| { | ||
| return deleteOnRemove; | ||
|
|
||
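The fallback in `getStartupCacheLoadStrategy()` is easier to read as a flat function: an explicitly configured strategy wins, and the deprecated `lazyLoadOnStart` flag is consulted only when no strategy is set. The following is a minimal sketch of that precedence, not the PR's class; `StartupStrategyNameSketch` and `resolve` are hypothetical names, and the string constants copy the `STRATEGY_NAME` values shown in the diff.

```java
// Hypothetical stand-in for illustration only; not a class from the PR.
class StartupStrategyNameSketch
{
  static final String LOAD_ALL_EAGERLY = "loadAllEagerly";
  static final String LOAD_ALL_LAZILY = "loadAllLazily";

  /**
   * Resolves the effective strategy name.
   * An explicit startupLoadStrategy takes precedence over the deprecated flag.
   */
  static String resolve(String startupLoadStrategy, boolean lazyLoadOnStart)
  {
    if (startupLoadStrategy != null) {
      return startupLoadStrategy;
    }
    // Legacy path: map the deprecated boolean onto the equivalent strategy name.
    return lazyLoadOnStart ? LOAD_ALL_LAZILY : LOAD_ALL_EAGERLY;
  }

  public static void main(String[] args)
  {
    System.out.println(resolve(null, false));
    System.out.println(resolve(null, true));
    System.out.println(resolve("loadEagerlyBeforePeriod", true));
  }
}
```

One consequence of this precedence is that setting both properties never produces a conflict: `lazyLoadOnStart=true` is silently ignored once `startupLoadStrategy` is set.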
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.druid.server.coordination.startup; | ||
|
|
||
| import org.apache.druid.timeline.DataSegment; | ||
|
|
||
| public interface HistoricalStartupCacheLoadStrategy | ||
| { | ||
| boolean shouldLoadLazily(DataSegment segment); | ||
|
||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,43 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.druid.server.coordination.startup; | ||
|
|
||
| import org.apache.druid.error.DruidException; | ||
| import org.apache.druid.segment.loading.SegmentLoaderConfig; | ||
|
|
||
| public class HistoricalStartupCacheLoadStrategyFactory | ||
| { | ||
| public static HistoricalStartupCacheLoadStrategy factorize(SegmentLoaderConfig config) | ||
| { | ||
| String strategyName = config.getStartupCacheLoadStrategy(); | ||
| switch (strategyName) { | ||
| case LoadAllLazilyStrategy.STRATEGY_NAME: | ||
| return new LoadAllLazilyStrategy(); | ||
| case LoadAllEagerlyStrategy.STRATEGY_NAME: | ||
| return new LoadAllEagerlyStrategy(); | ||
| case LoadEagerlyBeforePeriod.STRATEGY_NAME: | ||
| return new LoadEagerlyBeforePeriod(config.getStartupLoadPeriod()); | ||
| default: | ||
| throw DruidException.forPersona(DruidException.Persona.OPERATOR) | ||
| .ofCategory(DruidException.Category.UNSUPPORTED) | ||
| .build("Unknown configured Historical Startup Loading Strategy[%s]", strategyName); | ||
|
||
| } | ||
| } | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.druid.server.coordination.startup; | ||
|
|
||
| import org.apache.druid.java.util.common.logger.Logger; | ||
| import org.apache.druid.timeline.DataSegment; | ||
|
|
||
| public class LoadAllEagerlyStrategy implements HistoricalStartupCacheLoadStrategy | ||
| { | ||
| private static final Logger log = new Logger(LoadAllEagerlyStrategy.class); | ||
|
|
||
| public static final String STRATEGY_NAME = "loadAllEagerly"; | ||
|
|
||
| public LoadAllEagerlyStrategy() | ||
| { | ||
| log.info("Using [%s] strategy", STRATEGY_NAME); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean shouldLoadLazily(DataSegment segment) | ||
| { | ||
| return false; | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.druid.server.coordination.startup; | ||
|
|
||
| import org.apache.druid.java.util.common.logger.Logger; | ||
| import org.apache.druid.timeline.DataSegment; | ||
|
|
||
| public class LoadAllLazilyStrategy implements HistoricalStartupCacheLoadStrategy | ||
| { | ||
| private static final Logger log = new Logger(LoadAllLazilyStrategy.class); | ||
|
|
||
| public static final String STRATEGY_NAME = "loadAllLazily"; | ||
|
|
||
| public LoadAllLazilyStrategy() | ||
| { | ||
| log.info("Using [%s] strategy", STRATEGY_NAME); | ||
| } | ||
|
|
||
| @Override | ||
| public boolean shouldLoadLazily(DataSegment segment) | ||
| { | ||
| return true; | ||
| } | ||
| } |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,56 @@ | ||
| /* | ||
| * Licensed to the Apache Software Foundation (ASF) under one | ||
| * or more contributor license agreements. See the NOTICE file | ||
| * distributed with this work for additional information | ||
| * regarding copyright ownership. The ASF licenses this file | ||
| * to you under the Apache License, Version 2.0 (the | ||
| * "License"); you may not use this file except in compliance | ||
| * with the License. You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, | ||
| * software distributed under the License is distributed on an | ||
| * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| * KIND, either express or implied. See the License for the | ||
| * specific language governing permissions and limitations | ||
| * under the License. | ||
| */ | ||
|
|
||
| package org.apache.druid.server.coordination.startup; | ||
|
|
||
| import com.google.common.annotations.VisibleForTesting; | ||
| import org.apache.druid.java.util.common.DateTimes; | ||
| import org.apache.druid.java.util.common.logger.Logger; | ||
| import org.apache.druid.timeline.DataSegment; | ||
| import org.joda.time.DateTime; | ||
| import org.joda.time.Interval; | ||
| import org.joda.time.Period; | ||
|
|
||
| public class LoadEagerlyBeforePeriod implements HistoricalStartupCacheLoadStrategy | ||
|
||
| { | ||
| private static final Logger log = new Logger(LoadEagerlyBeforePeriod.class); | ||
| public static final String STRATEGY_NAME = "loadEagerlyBeforePeriod"; | ||
|
|
||
| private final Interval eagerLoadingWindow; | ||
|
||
|
|
||
| public LoadEagerlyBeforePeriod(Period eagerLoadingPeriod) | ||
| { | ||
| DateTime now = DateTimes.nowUtc(); | ||
| this.eagerLoadingWindow = new Interval(now.minus(eagerLoadingPeriod), now); | ||
|
|
||
| log.info("Using [%s] strategy with Interval[%s]", STRATEGY_NAME, eagerLoadingWindow); | ||
| } | ||
|
|
||
| @VisibleForTesting | ||
| public Interval getEagerLoadingWindow() | ||
| { | ||
| return this.eagerLoadingWindow; | ||
| } | ||
|
|
||
| @Override | ||
| public boolean shouldLoadLazily(DataSegment segment) | ||
| { | ||
| return !segment.getInterval().overlaps(eagerLoadingWindow); | ||
| } | ||
| } | ||
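Two boundary cases of `shouldLoadLazily` in `LoadEagerlyBeforePeriod` are worth spelling out. Assuming Joda-Time's documented half-open interval semantics (start inclusive, end exclusive, abutting intervals do not overlap), a segment that ends exactly where the eager window begins is loaded lazily, and so is a segment that starts at or after the startup time. The sketch below reproduces that rule with `java.time` as a stand-in; `HalfOpenOverlapSketch` and its methods are hypothetical and only approximate `Interval.overlaps` for non-empty intervals.

```java
import java.time.Instant;

// Hypothetical helper for illustration only; java.time stands in for Joda-Time's Interval.
class HalfOpenOverlapSketch
{
  /** Half-open overlap of [aStart, aEnd) and [bStart, bEnd); abutting intervals do not overlap. */
  static boolean overlaps(Instant aStart, Instant aEnd, Instant bStart, Instant bEnd)
  {
    return aStart.isBefore(bEnd) && bStart.isBefore(aEnd);
  }

  /** Same decision shape as the strategy: lazy when the segment misses the eager window. */
  static boolean shouldLoadLazily(Instant segStart, Instant segEnd, Instant windowStart, Instant windowEnd)
  {
    return !overlaps(segStart, segEnd, windowStart, windowEnd);
  }

  public static void main(String[] args)
  {
    // Eager window for a startup at 2025-01-10 with a seven-day period.
    Instant windowStart = Instant.parse("2025-01-03T00:00:00Z");
    Instant windowEnd = Instant.parse("2025-01-10T00:00:00Z");

    // Segment ending exactly at the window start (abutting).
    System.out.println(shouldLoadLazily(
        Instant.parse("2025-01-02T00:00:00Z"), Instant.parse("2025-01-03T00:00:00Z"), windowStart, windowEnd));
    // Segment straddling the window start.
    System.out.println(shouldLoadLazily(
        Instant.parse("2025-01-02T00:00:00Z"), Instant.parse("2025-01-04T00:00:00Z"), windowStart, windowEnd));
    // Segment starting at the startup time (future-dated data).
    System.out.println(shouldLoadLazily(
        Instant.parse("2025-01-10T00:00:00Z"), Instant.parse("2025-01-11T00:00:00Z"), windowStart, windowEnd));
  }
}
```

Because the window is fixed once in the constructor, segments with future-dated intervals are treated the same as old segments and deferred until first query.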