Description
Is your feature request related to a problem? Please describe
While executing the fetch phase of a query, we iterate over all the desired document IDs and load the documents sequentially using StoredFieldReader. This is probably fine on faster storage devices, but on slower storage the latency adds up significantly.
I tested the impact of this on a "slower" storage device by fetching 10 documents from a very small data set, after making sure none of the data was in the page cache.
I executed a simple term query on this 11 GB data set, and with a cold page cache the query took ~200 ms. Of that, roughly 150 ms was spent in the fetch phase.
Describe the solution you'd like
Lucene supports prefetching in the stored field reader, which takes a docId as input. This enables us to prefetch the data for the required docIds asynchronously. Depending on how many documents need to be fetched, we can also think about doing this concurrently (e.g. if we want to return 100 documents rather than 10 or so).
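To make the idea concrete, here is a rough sketch of the two-pass pattern: issue all prefetch hints first, then read the documents in order, so the storage layer can overlap the I/O. `DocStore` and `PrefetchingFetcher` are hypothetical names invented for this illustration, not OpenSearch or Lucene types. I believe recent Lucene versions expose a similar `prefetch(int docID)` call on `StoredFields`, but that should be checked against the version in use.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a stored-fields reader; not a real Lucene or OpenSearch type. */
interface DocStore {
    /** Hint that the bytes for docId will be needed soon; should not block on I/O. */
    void prefetch(int docId) throws IOException;

    /** Load the stored document, blocking until its bytes are available. */
    String document(int docId) throws IOException;
}

/** Hypothetical helper showing the order of calls in a prefetching fetch phase. */
final class PrefetchingFetcher {
    private PrefetchingFetcher() {}

    static List<String> fetch(DocStore store, int[] docIds) throws IOException {
        // Pass 1: announce every document we are about to read.
        for (int docId : docIds) {
            store.prefetch(docId);
        }
        // Pass 2: read in the original order; ideally the data is already on its way.
        List<String> docs = new ArrayList<>(docIds.length);
        for (int docId : docIds) {
            docs.add(store.document(docId));
        }
        return docs;
    }
}
```

The sketch stays sequential on purpose. Whether to additionally split pass 2 across threads for larger result sizes is the open question raised above.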
In some cases fdt files are opened as niofs files. We can also implement prefetching in niofs using posix_fadvise. We would probably have to maintain a separate buffer in the niofs index input for prefetched data.
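A minimal sketch of what a separate prefetch buffer might look like for an NIO-backed input. Note the substitution: the standard Java library has no binding for posix_fadvise, so this sketch starts a background positional read into a side buffer instead of advising the kernel. `PrefetchingNioInput` and all of its methods are hypothetical and only track a single prefetched range. A real niofs index input would need to handle multiple ranges, eviction, and slices.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;

/** Hypothetical NIO-backed input that keeps one prefetched range in a side buffer. */
final class PrefetchingNioInput implements AutoCloseable {
    private final FileChannel channel;
    private final Executor executor;
    private long prefetchOffset = -1;
    private CompletableFuture<ByteBuffer> prefetched;
    private int prefetchHits;

    PrefetchingNioInput(Path path, Executor executor) throws IOException {
        this.channel = FileChannel.open(path, StandardOpenOption.READ);
        this.executor = executor;
    }

    /** Start reading [offset, offset + length) in the background and return immediately. */
    synchronized void prefetch(long offset, int length) {
        prefetchOffset = offset;
        prefetched = CompletableFuture.supplyAsync(() -> readRange(offset, length), executor);
    }

    /** Read a range, served from the prefetch buffer when that buffer covers the request. */
    synchronized byte[] read(long offset, int length) {
        ByteBuffer source = null;
        if (prefetched != null && offset >= prefetchOffset) {
            ByteBuffer buf = prefetched.join(); // waits only if the background read is still running
            if (offset + length <= prefetchOffset + buf.limit()) {
                ByteBuffer view = buf.duplicate();
                view.position((int) (offset - prefetchOffset));
                source = view;
                prefetchHits++;
            }
        }
        if (source == null) {
            source = readRange(offset, length); // fall back to a direct blocking read
        }
        byte[] out = new byte[Math.min(length, source.remaining())];
        source.get(out);
        return out;
    }

    synchronized int prefetchHits() {
        return prefetchHits;
    }

    /** Positional read; does not move the channel position, so it is safe from any thread. */
    private ByteBuffer readRange(long offset, int length) {
        try {
            ByteBuffer buf = ByteBuffer.allocate(length);
            long pos = offset;
            while (buf.hasRemaining()) {
                int n = channel.read(buf, pos);
                if (n < 0) {
                    break; // end of file
                }
                pos += n;
            }
            buf.flip();
            return buf;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }
}
```

Compared with posix_fadvise, this approach copies the data into the heap up front, which costs memory but does not depend on the page cache keeping the advised pages around.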
Related component
No response
Describe alternatives you've considered
No response
Additional context
No response