Releases: intel/ScalableVectorSearch
v0.0.10
What's Changed
- Support for OpenMP threading (#170, #171, #172)
- Enable Inverted File index (#156)
- Resolution of inf/nan crashes in LVQ (#174, #178)
- SVS_LAZY update to address alpine issue (#175)
- Updates to build parameter defaults (#176)
- MKL symbols not exported from SVS libs
Full Changelog: v0.0.9...v0.0.10
v1.0.0-dev
Rolling release page for next major release, 1.0.0. This page provides nightly builds with the latest commits to main. See v0.0.9...main
v0.0.9
What's Changed
- fix: Multi-vector dynamic vamana index Save/Load functionality by @rfsaliev in #162
- feature: save and load in flat index by @yuejiaointel in #163
- Handle corner in dynamic index with insufficient valid search results by @ibhati in #164
- change to v0.0.9 by @ahuber21 in #169
Full Changelog: v0.0.8...v0.0.9
v0.0.8
- Addition of 8-bit scalar quantization support to the C++ interface
- Introduced multi-vector index and batch iterator support that allows multiple vectors to be mapped to the same external ID
- Automatic ISA dispatching with optimizations based on AVX support
- Enabled compatibility with ARM and macOS
- Enhanced logging capabilities
- Updated vamana iterator API
- Broader shared library support:
  - gcc-11+, clang-18+, glibc 2.26+ compatibility
  - Static library provided in addition to .so
  - Intel(R) MKL linked within the shared library - no need for Intel(R) MKL in user environment
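For readers unfamiliar with 8-bit scalar quantization, the following is a minimal sketch of one common form of the idea, assuming a single global scale and bias for the whole dataset. The function names (`sq8_train`, `sq8_encode`, `sq8_decode`) are hypothetical and are not part of the SVS API; the scheme used inside SVS may differ.

```python
from typing import List, Tuple


def sq8_train(vectors: List[List[float]]) -> Tuple[float, float]:
    """Return (scale, bias) mapping the global value range onto 0..255."""
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    # Guard against a constant dataset, which would give a zero scale.
    scale = (hi - lo) / 255.0 or 1.0
    return scale, lo


def sq8_encode(vector: List[float], scale: float, bias: float) -> bytes:
    """Quantize each component to one unsigned byte (clamped to 0..255)."""
    return bytes(min(255, max(0, round((x - bias) / scale))) for x in vector)


def sq8_decode(codes: bytes, scale: float, bias: float) -> List[float]:
    """Reconstruct approximate float values from the byte codes."""
    return [c * scale + bias for c in codes]
```

Each component is stored in one byte instead of four, at the cost of a reconstruction error of at most half a quantization step for values inside the training range.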
Note
SVS shared and static libraries are included in the tarballs. Binary variants included in this release:
- `svs-shared-library-0.0.8.tar.gz`: Standard build for GCC 11+ and glibc 2.28+, compatible with most modern Linux distributions
- `glibc2_26` suffix: Build for GCC 11+ and glibc 2.26 (e.g. Amazon Linux 2)
- `clang` suffix: Build for Clang-18+
- `reduced` suffix: Lighter-weight builds for specific integrations, not recommended for general use
v0.0.8-dev
Note that the shared library binaries are built with gcc-11 (or clang-18 for binaries suffixed with "clang"), GLIBC 2.28, oneAPI 2024.1, and cmake 3.26.5. These versions or later are required for use of the shared library (except oneAPI/MKL, which is only required if using LeanVec). All 0.0.8 binaries are statically linked to MKL and do not require it in the user environment.
Note that most of these are compatible with AVX2 and above, unless specifically suffixed with AVX512 (these are compatible with AVX512 only).
v0.0.7
SVS 0.0.7 Release Notes
- Implemented batch iterator support for hybrid search
- Added support for custom threading and memory allocation
- Introduced a timeout feature for search calls
- Introduced `reuse_empty` flag in dynamic Vamana, enabling users to choose whether to reuse empty entries that may exist after deletion and consolidation
- Enhanced heuristics in the Vamana construct to improve efficiency when adding a small number of points.
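The effect of a `reuse_empty`-style flag can be pictured with a toy slot store. This is only a sketch of the general idea: the `SlotStore` class, its methods, and the default value of the flag are hypothetical and do not reflect how dynamic Vamana is implemented in SVS.

```python
from typing import Dict, List, Optional


class SlotStore:
    """Toy storage that maps external IDs to internal slots (hypothetical)."""

    def __init__(self) -> None:
        self.slots: List[Optional[str]] = []  # None marks an empty slot
        self.slot_of: Dict[int, int] = {}     # external ID -> slot index

    def add(self, ext_id: int, payload: str, reuse_empty: bool = False) -> int:
        slot = None
        if reuse_empty:
            # Look for a hole left behind by an earlier deletion.
            slot = next((i for i, p in enumerate(self.slots) if p is None), None)
        if slot is None:
            # Either reuse is disabled or there is no hole: grow the storage.
            self.slots.append(payload)
            slot = len(self.slots) - 1
        else:
            self.slots[slot] = payload
        self.slot_of[ext_id] = slot
        return slot

    def delete(self, ext_id: int) -> None:
        """Remove an entry and leave its slot empty."""
        self.slots[self.slot_of.pop(ext_id)] = None
```

With reuse disabled the storage only grows; with reuse enabled, new points fill the holes first, which keeps the storage compact at the cost of a lookup for a free slot.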
Note that the shared library binaries are built with gcc 11.2.0, GLIBC 2.28, oneAPI 2024.1, cmake 3.26.5. These versions or later are required for use of the shared library (except oneAPI/MKL which is only required if using LeanVec). Also note that some of these are under active development and may change.
Use the avx512 binaries on machines with AVX512 instruction support for best performance.
v0.0.6
SVS 0.0.6 Release Notes
Please note that this repository only contains the open-source portion of the SVS library, which supports all functionalities and features described in the documentation, except for our proprietary vector compression techniques, specifically LVQ [ABHT23] and LeanVec [TBAH24]. These techniques are closed-source and supported exclusively on Intel hardware. We provide a shared library and a PyPI package to enable these vector compression techniques in C++ and Python, respectively.
v0.0.3
SVS 0.0.3 Release Notes
Highlighted Features
- Turbo LVQ: A SIMD optimized layout for LVQ that can improve end-to-end search
  performance for LVQ-4 and LVQ-4x8 encoded datasets.
- Split-buffer: An optimization that separates the search window size used during greedy
  search from the actual search buffer capacity. For datasets that use reranking (two-level
  LVQ and LeanVec), this allows more neighbors to be passed to the reranking phase without
  increasing the time spent in greedy search.
- LeanVec dimensionality reduction is now included as an experimental feature!
  This two-level technique uses a linear transformation to generate a primary dataset with
  lower dimensionality than full precision vectors.
  The initial portion of a graph search is performed using this primary dataset, and the
  full precision secondary dataset is then used to rerank candidates.
  Because of the reduced dimensionality, LeanVec can greatly accelerate index construction
  for high-dimensional datasets.

  As an experimental feature, future changes to this API are expected.
  However, the implementation in this release is sufficient to enable experimenting with
  this technique on your own datasets!
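The two-level idea behind LeanVec can be sketched in plain Python. This is a simplified illustration under stated assumptions: the projection matrix is supplied by the caller rather than learned, an exhaustive scan stands in for the graph search, and all names (`project`, `two_level_search`) are hypothetical rather than SVS APIs.

```python
from typing import List, Sequence

Vector = Sequence[float]


def project(matrix: List[Vector], v: Vector) -> List[float]:
    """Apply a (reduced_dims x full_dims) linear map to one vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_level_search(query: Vector, full: List[Vector], matrix: List[Vector],
                     num_candidates: int, k: int) -> List[int]:
    primary = [project(matrix, v) for v in full]  # built once in practice
    q_low = project(matrix, query)
    # Stage 1: cheap scan in the reduced space (stand-in for graph search).
    order = sorted(range(len(full)), key=lambda i: sq_dist(q_low, primary[i]))
    candidates = order[:num_candidates]
    # Stage 2: rerank the survivors with the full-dimensional vectors.
    candidates.sort(key=lambda i: sq_dist(query, full[i]))
    return candidates[:k]
```

Passing more candidates to the second stage gives reranking a chance to recover neighbors that the reduced space ranks poorly, which is also the motivation for the split-buffer optimization.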
New Dependencies
pysvs (Python)
Additions and Changes
- Added the `LeanVecLoader` class as a dataset loader enabling use of
  LeanVec dimensionality reduction.

  The main constructor is shown below:

      pysvs.LeanVecLoader(
          loader: pysvs.VectorDataLoader,
          leanvec_dims: int,
          primary: pysvs.LeanVecKind = pysvs.LeanVecKind.lvq8,
          secondary: pysvs.LeanVecKind = pysvs.LeanVecKind.lvq8
      )

  where:

  - `loader` is the loader for the uncompressed dataset.
  - `leanvec_dims` is the target reduced dimensionality of the primary dataset.
    This should be less than `loader.dims` to provide a performance boost.
  - `primary` is the encoding to use for the reduced-dimensionality dataset.
  - `secondary` is the encoding to use for the full-dimensionality dataset.

  Valid options for `pysvs.LeanVecKind` are: `float16`, `float32`, `lvq4`, `lvq8`.

  See the documentation for docstrings and an example.
- Search parameters controlling recall and performance for the Vamana index are now set and
  queried through a `pysvs.VamanaSearchParameters` configuration class. The layout of this
  class is as follows:

      class VamanaSearchParameters

      Parameters controlling recall and performance of the VamanaIndex.
      See also: `Vamana.search_parameters`.

      Attributes:
          buffer_config (`pysvs.SearchBufferConfig`, read/write): Configuration state for the
              underlying search buffer.
          search_buffer_visited_set (bool, read/write): Enable/disable status of the search
              buffer visited set.

  with `pysvs.SearchBufferConfig` defined by

      class pysvs.SearchBufferConfig

      Size configuration for the Vamana index search buffer.
      See also: `pysvs.VamanaSearchParameters`, `pysvs.Vamana.search_parameters`.

      Attributes:
          search_window_size (int, read-only): The number of valid entries in the buffer
              that will be used to determine stopping conditions for graph search.
          search_buffer_capacity (int, read-only): The (expected) number of valid entries that
              will be available. Must be at least as large as `search_window_size`.

  Example usage is shown below.

      index = pysvs.Vamana(...)
      # Get the current parameters of the index.
      parameters = index.search_parameters
      print(parameters)
      # Possible Output: VamanaSearchParameters(
      #    buffer_config = SearchBufferConfig(search_window_size = 0, total_capacity = 0),
      #    search_buffer_visited_set = false
      # )

      # Update our local copy of the search parameters
      parameters.buffer_config = pysvs.SearchBufferConfig(10, 20)

      # Assign the modified parameters to the index. Future searches will be affected.
      index.search_parameters = parameters
- Split search buffer for the Vamana search index. This is achieved by using different
  values for the `search_window_size` and `search_buffer_capacity` fields of the
  `pysvs.SearchBufferConfig` class described above.

  An index configured this way will maintain more entries in its search buffer while still
  terminating search relatively early. For implementations such as two-level LVQ that use
  reranking, this can boost recall without significantly increasing the effective
  search window size.

  For uncompressed indexes that do not use reranking, split-buffer can be used to decrease
  the search window size lower than the requested number of neighbors (provided the
  capacity is at least the number of requested neighbors). This enables continued trading
  of recall for search performance.
- Added `pysvs.LVQStrategy` for picking between different flavors of LVQ. The values
  and meanings are given below.

  - `Auto`: Let pysvs decide from among the available options.
  - `Sequential`: Use the original implementation of LVQ which bit-packs subsequent vector
    elements sequentially in memory.
  - `Turbo`: Use an experimental implementation of LVQ that permutes the packing of
    subsequent vector elements to permit faster distance computations.

  The selection of strategy can be given using the `strategy` keyword argument of
  `pysvs.LVQLoader` and defaults to `pysvs.LVQStrategy.Auto`.
- Index construction and loading methods will now list the registered index specializations.
- Assigning the `padding` keyword to `LVQLoader` will now be respected when reloading a
  previously saved LVQ dataset.
- Changed the implementation of the greedy-search visited set to be effective when operating
  in the high-recall/high-neighbors regime. It can be enabled with:

      index = pysvs.Vamana(...)
      p = index.search_parameters
      p.search_buffer_visited_set = True
      index.search_parameters = p
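To make the split search buffer from the list above more concrete, here is a toy best-first graph search in which only the first `window_size` buffer entries decide when to stop, while up to `capacity` entries are retained. It is a sketch under simplifying assumptions (a dictionary adjacency list, a caller-supplied distance function, a plain sorted list as the buffer); `greedy_search` is a hypothetical name and this is not the SVS implementation.

```python
import bisect
from typing import Callable, Dict, List, Tuple


def greedy_search(graph: Dict[int, List[int]], dist: Callable[[int], float],
                  entry: int, window_size: int,
                  capacity: int) -> List[Tuple[float, int]]:
    """Best-first graph search with a split search buffer (toy version)."""
    if capacity < window_size:
        raise ValueError("capacity must be at least window_size")
    buffer: List[Tuple[float, int]] = [(dist(entry), entry)]  # kept sorted
    seen = {entry}
    expanded = set()
    while True:
        # Only the first `window_size` entries decide whether to continue.
        frontier = [n for _, n in buffer[:window_size] if n not in expanded]
        if not frontier:
            return buffer
        node = frontier[0]
        expanded.add(node)
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                bisect.insort(buffer, (dist(nbr), nbr))
        # Up to `capacity` entries are retained, e.g. for reranking.
        del buffer[capacity:]
```

With `window_size = 1` and `capacity = 3`, the search stops as early as it would with a one-entry buffer but still returns three candidates.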
Experimental Features
Features marked as experimental are subject to rapid API changes, improvement, and
removal.
- Added the `experimental_backend_string` read-only parameter to `pysvs.Vamana` to aid in
  recording and debugging the backend implementation.
- Introduced `pysvs.Vamana.experimental_calibrate` to aid in selecting the best runtime
  performance parameters for an index to achieve a desired recall.

  This feature can be used as follows:

      # Create an index
      index = pysvs.Vamana(...)
      # Load queries and groundtruth
      queries = pysvs.read_vecs(...)
      groundtruth = pysvs.read_vecs(...)
      # Optimize the runtime state of the index for 0.90 10-recall-at-10
      index.experimental_calibrate(queries, groundtruth, 10, 0.90)

  See the documentation for a more detailed explanation.
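The "10-recall-at-10" target mentioned above is a k-recall-at-k metric. The sketch below shows how such a metric can be computed and how a naive selection rule might pick a search window size from measured recalls. The release notes do not describe the actual calibration procedure, so both helper functions are hypothetical illustrations only.

```python
from typing import Dict, Optional, Sequence


def k_recall_at_k(results: Sequence[Sequence[int]],
                  groundtruth: Sequence[Sequence[int]], k: int) -> float:
    """Mean fraction of the true top-k neighbors found in the returned top-k."""
    hits = 0
    for found, truth in zip(results, groundtruth):
        hits += len(set(found[:k]) & set(truth[:k]))
    return hits / (k * len(results))


def smallest_window(recall_by_window: Dict[int, float],
                    target: float) -> Optional[int]:
    """Pick the smallest window size whose measured recall meets the target."""
    ok = [w for w, r in sorted(recall_by_window.items()) if r >= target]
    return ok[0] if ok else None
```

Smaller window sizes generally mean faster searches, so choosing the smallest setting that still meets the recall target is a reasonable starting heuristic.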
Deprecations
- Versions `0.0.1` and `0.0.2` of `VamanaConfigParameters` (the top-level configuration file
  for the Vamana index) are deprecated. The current version is now `v0.0.3`. Older versions
  will continue to work until the next minor release of SVS.

  To upgrade, use the `convert_legacy_vamana_index` binary utility described below.
- The attribute `pysvs.Vamana.visited_set_enabled` is deprecated and will be removed in the
  next minor release of SVS. It is being replaced with `pysvs.Vamana.search_parameters`.
- The LVQ loader classes `pysvs.LVQ4`, `pysvs.LVQ8`, `pysvs.LVQ4x4`, `pysvs.LVQ4x8` and
  `pysvs.LVQ8x8` are deprecated in favor of a single class `pysvs.LVQLoader`. This class
  has similar arguments to the previous family, but encodes the number of bits for the
  primary and residual datasets as run-time values.

  For example,

      # Old
      loader = pysvs.LVQ4x4("dataset", dims = 768, padding = 32)
      # New
      loader = pysvs.LVQLoader("dataset", primary = 4, residual = 4, dims = 768, padding = 32)
- Version `v0.0.2` of serialized LVQ datasets is broken; the current version is now
  `v0.0.3`. This change was made to facilitate a canonical on-disk representation of LVQ.

  Going forward, previously saved LVQ formats can be reloaded using different runtime
  alignments and different packing strategies without requiring whole dataset recompression.

  Any previously saved datasets will need to be regenerated from uncompressed data.
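When migrating scripts away from the deprecated loader classes, a small helper can translate a legacy class name into the bit widths expected by the single-loader style. The helper below is hypothetical and not part of pysvs; using `0` for the residual of one-level variants is an assumption made for this sketch.

```python
import re
from typing import Dict


def lvq_name_to_bits(name: str) -> Dict[str, int]:
    """Translate a legacy class name such as "LVQ4x8" into bit widths."""
    match = re.fullmatch(r"LVQ(\d+)(?:x(\d+))?", name)
    if match is None:
        raise ValueError(f"not an LVQ class name: {name!r}")
    primary, residual = match.groups()
    # One-level variants such as LVQ8 have no residual; 0 is used here.
    return {"primary": int(primary), "residual": int(residual or 0)}
```

The returned dictionary could then be unpacked as keyword arguments alongside the dataset path, `dims`, and `padding`.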
Build System Changes
Building pysvs using cibuildwheel now requires a custom docker container with MKL.
To build the container, run the following commands:
    cd ./docker/x86_64/manylinux2014/
    ./build.sh
libsvs (C++)
Changes
- Added `svs::index::vamana::VamanaSearchParameters` and
  `svs::index::vamana::SearchBufferConfig`. The latter contains parameters for the search
  buffer sizing while the former groups all algorithmic and performance parameters of search
  together in a single class.
- API addition of `get_search_parameters()` and `set_search_parameters()` to `svs::Vamana`
  and `svs::DynamicVamana` as the new API for getting and setting all search parameters.
- Introducing split-buffer for the search buffer (see description in the Python section)
  to potentially increase recall when using reranking.
- Overhauled LVQ implementation, adding an additional template parameter to
  `lvq::CompressedVectorBase` and friends. This parameter assumes the following types:
  - `lvq::Sequential`: Store dimension encodings sequentially in memory. This corresponds
    to the original LVQ implementation.
  - `lvq::Turbo<size_t Lanes, size_t ElementsPerLane>`: Use a SIMD optimized format,
    optimized to use `Lanes` SIMD lanes, storing `ElementsPerLane`. Selection of these
    parame...
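As a rough picture of what storing encodings sequentially in memory means for 4-bit codes, the following sketch packs two codes per byte in element order. It is written in Python for consistency with the other sketches in these notes. The nibble order (low nibble first) and the function names are assumptions made for illustration; the actual SVS layouts, including the Turbo permutation, are not reproduced here.

```python
from typing import List


def pack4_sequential(codes: List[int]) -> bytes:
    """Pack 4-bit codes two per byte, in element order (low nibble first)."""
    if any(not 0 <= c <= 15 for c in codes):
        raise ValueError("codes must fit in 4 bits")
    # Pad with a zero code so that the codes pair up evenly.
    padded = codes + [0] * (len(codes) % 2)
    return bytes(lo | (hi << 4) for lo, hi in zip(padded[0::2], padded[1::2]))


def unpack4_sequential(packed: bytes, count: int) -> List[int]:
    """Inverse of pack4_sequential for the first `count` elements."""
    out: List[int] = []
    for byte in packed:
        out.extend((byte & 0x0F, byte >> 4))
    return out[:count]
```

A permuted layout would change which elements share a byte or a SIMD lane, but not the number of bits stored per element.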
v0.0.2
SVS 0.0.2 Release Notes
pysvs (Python)
- Deprecated `num_threads` keyword argument from `pysvs.VamanaBuildParameters` and added
  `num_threads` keyword to `pysvs.Vamana.build`.
- Exposed the `prune_to` parameter for `pysvs.VamanaBuildParameters` (see description below
  for an explanation of this change).
- Added preliminary support for building `pysvs.Flat` and `pysvs.Vamana` directly from
  `np.float16` arrays.
libsvs (C++)
Breaking Changes
- Removed `nthreads` member of `VamanaBuildParameters` and added the number of threads as
  an argument to `svs::Vamana::build`/`svs::Vamana::build`.
- Added a `prune_to` argument to `VamanaBuildParameters`. This can be set to a value less
  than `graph_max_degree` (heuristically, setting this to be 4 less is a good trade-off
  between accuracy and speed). When pruning is performed, this parameter is used to
  determine the number of candidates to generate after pruning. Setting this less than
  `graph_max_degree` greatly reduces the time spent when managing backedges.
- Improved pruning rules for Euclidean and InnerProduct. Vamana index construction should
  be faster and yield slightly improved indexes.
- Added an experimental external-threading interface to `svs::index::VamanaIndex`.
- Overhauled extension mechanisms using a `tag_invoke` style approach. This decouples the
  `svs::index::VamanaIndex` implementation from extensions like LVQ, reducing header
  dependence and improving precision of algorithm customization.
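The benefit of setting `prune_to` below `graph_max_degree` can be illustrated with a toy count of prune operations while backedges are added to an already full adjacency list. This is a sketch under simplifying assumptions: neighbors are represented only by their distances, and pruning simply keeps the closest entries, which stands in for the real pruning rule. The function names are hypothetical.

```python
from typing import List


def add_backedge(neighbors: List[float], candidate: float,
                 graph_max_degree: int, prune_to: int) -> bool:
    """Append a neighbor and prune only when the list overflows.

    Returns True when a prune was triggered.
    """
    neighbors.append(candidate)
    if len(neighbors) <= graph_max_degree:
        return False
    # Simplified prune: keep the `prune_to` closest neighbors.
    neighbors.sort()
    del neighbors[prune_to:]
    return True


def count_prunes(num_backedges: int, graph_max_degree: int, prune_to: int) -> int:
    """Count prunes triggered while adding backedges to a full list."""
    neighbors = [float(i) for i in range(graph_max_degree)]  # already full
    return sum(
        add_backedge(neighbors, 0.5 + i, graph_max_degree, prune_to)
        for i in range(num_backedges)
    )
```

In this toy model, adding 20 backedges with `graph_max_degree = 32` triggers 20 prunes when `prune_to = 32` but only 4 when `prune_to = 28`, because each prune leaves slack for four further insertions.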
Save/Load API
- Enabled context-free saving and loading of simple data structures. This allows simple
  data structures to be saved and reloaded from TOML files without requiring access to the
  saving/loading directory. Classes implementing this saving and loading allow for more
  flexible storage.
- Overhauled the implementation of saving and loading to enable more scalable implementation.
  `svs::data::SimpleData` family of data structures are now directly saveable and loadable
  and no longer require proxy-classes.
Breaking Serialization Changes
- Changed LVQ-style datasets from `v0.0.1` to `v0.0.2`: Removed centroids from being stored
  with the ScaledBiasedCompressedDataset. Centroids are now stored in the higher level LVQ
  dataset.
Back-end Changes
Changes to library internals that do not necessarily affect the top level API but could
affect performance or users relying on internal APIs.
- Improved the performance of the LVQ inner-product implementation.
- Moved dynamic dispatcher from the Python bindings into `libsvs`.
- Data structure loading has been augmented with the `svs::lib::Lazy` class, allowing for
  arbitrary deferred work to be executed when loading data structures.
- Removed the old "access mode" style API for multi-level datasets, instead using
  `tag_invoke` for customization.
- Reduced binary footprint by removing `std::function` use for general multi-threaded
  functions.
- Updated `ANNException` to use `fmtlib` style messages directly rather than `std::ostream`
  style overloading. The new syntax turns

      ANNEXCEPTION("Expected ", a, ", got ", b, "!");

  to

      ANNEXCEPTION("Expected {}, got {}!", a, b);
Binaries and Utilities
- Added a benchmarking framework in `/benchmark` to automatically run and aggregate index
  construction and search for large scale benchmarks. Documentation is currently sparse
  but planned.
Third Party
- Bump fmtlib from 9.1.0 to 10.1.1.
v0.0.1
Initial tagged version of the code as a VLDB'23 artifact.