ANN Benchmark
All logos are embedded with a computer vision model. To help with logo annotation, the nearest neighbors of each logo must be easily accessible. A benchmark was performed to determine the best way to run a nearest neighbors search among the following HNSW indexes:
- [faiss](https://github.com/facebookresearch/faiss)
- [redis](https://redis.io/docs/stack/search/reference/vectors/)
- [elasticsearch](https://www.elastic.co/fr/blog/introducing-approximate-nearest-neighbor-search-in-elasticsearch-8-0)
For each framework, multiple parameters were used to find the best index in our use case.
Our Use Case
The data used to test the performance of the indexes consisted of 4,371,343 vectors (each produced by the CLIP vision model with a logo as input) of dimension 768 and type float32.
The dimension of each vector used in the benchmark is 768, the size of the embeddings CLIP outputs before they go through the projection layer. In production, however, this last layer of the CLIP model is used, so the indexed embeddings actually have dimension 512. Since going from 768 to 512 dimensions is only a projection, the change of dimensionality should not impact the results. A test of Elasticsearch with 512-dimensional vectors did indeed lead to results very similar to the ones computed with 768-dimensional vectors.
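For illustration, here is a minimal sketch of how a logo crop could be embedded with CLIP using the HuggingFace `transformers` library. The checkpoint name, file path, and pre-processing are assumptions for the example and may differ from what runs in production:

```python
from PIL import Image
from transformers import CLIPProcessor, CLIPVisionModel, CLIPVisionModelWithProjection

MODEL_ID = "openai/clip-vit-base-patch32"  # assumed checkpoint, for illustration only
processor = CLIPProcessor.from_pretrained(MODEL_ID)

image = Image.open("logo_crop.png")  # hypothetical logo crop
inputs = processor(images=image, return_tensors="pt")

# 768-dimensional embedding, before the projection layer (benchmark setting).
vision_model = CLIPVisionModel.from_pretrained(MODEL_ID)
embedding_768 = vision_model(**inputs).pooler_output        # shape (1, 768)

# 512-dimensional embedding, after the projection layer (production setting).
projected_model = CLIPVisionModelWithProjection.from_pretrained(MODEL_ID)
embedding_512 = projected_model(**inputs).image_embeds      # shape (1, 512)
```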
The goal is to search for nearest neighbors in real time, so the search time must be short. The precision of the search also matters: the returned logos must be as close as possible to the query logo for the annotators to be efficient.
The feature is expected to work in production on machines that run other tasks in parallel. The memory used by the index and search must thus be as small as possible.
Conditions of the tests
All HNSW results were compared to the results of a FLAT faiss index search (an exact search) to compute the precision of each index. Only cosine similarity was used in this benchmark.
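As an illustration, here is a minimal sketch of how the exact FLAT baseline and a micro-recall@k metric can be computed with faiss. The random placeholder data stands in for the real embeddings and is much smaller:

```python
import faiss
import numpy as np

rng = np.random.default_rng(0)
# Placeholder data standing in for the 4,371,343 logo embeddings (kept small here).
embeddings = rng.random((10_000, 768), dtype=np.float32)
queries = rng.random((100, 768), dtype=np.float32)

# L2-normalise so that inner product is equivalent to cosine similarity.
faiss.normalize_L2(embeddings)
faiss.normalize_L2(queries)

# Exact (FLAT) inner-product index used as the ground-truth baseline.
flat_index = faiss.IndexFlatIP(768)
flat_index.add(embeddings)

k = 10
_, exact_neighbors = flat_index.search(queries, k)


def micro_recall_at_k(ann_neighbors, exact_neighbors):
    """Fraction of exact top-k neighbours that the ANN index also returned."""
    hits = sum(
        len(set(ann_row) & set(exact_row))
        for ann_row, exact_row in zip(ann_neighbors, exact_neighbors)
    )
    return hits / exact_neighbors.size
```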
HNSW indexes are adjustable through three parameters (see the faiss sketch after the list below):
- m: the number of connections that each node in the index has to its neighbors
- efConstruction: the size of the priority queue used during the construction of the index
- efSearch/efRuntime/num_candidates: the size of the priority queue used during search operations
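A minimal sketch of how these parameters map to the faiss HNSW API, reusing the normalised placeholder arrays and the `micro_recall_at_k` helper from the previous sketch:

```python
import faiss

d, m = 768, 16
index = faiss.IndexHNSWFlat(d, m, faiss.METRIC_INNER_PRODUCT)  # m: connections per node
index.hnsw.efConstruction = 256  # priority-queue size while building the graph
index.add(embeddings)            # normalised placeholder embeddings from the sketch above

index.hnsw.efSearch = 128        # priority-queue size at query time
_, ann_neighbors = index.search(queries, 10)
print(micro_recall_at_k(ann_neighbors, exact_neighbors))
```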
Benchmark
You can find the code of the various benchmarks here.
Here are the results obtained with Faiss:
m | efConstruction | efSearch | micro-recall@1 | micro-recall@4 | micro-recall@10 | micro-recall@50 | micro-recall@100 |
---|---|---|---|---|---|---|---|
8 | 128 | 64 | 0.582 | 0.69475 | 0.706 | 0.70754 | 0.6829 |
8 | 128 | 128 | 0.591 | 0.70925 | 0.7238 | 0.72874 | 0.72166 |
8 | 256 | 64 | 0.598 | 0.714 | 0.7265 | 0.72542 | 0.70022 |
8 | 256 | 128 | 0.603 | 0.72225 | 0.7356 | 0.74082 | 0.7346 |
16 | 128 | 64 | 0.605 | 0.72875 | 0.7393 | 0.74212 | 0.72658 |
16 | 128 | 128 | 0.615 | 0.737 | 0.7476 | 0.75206 | 0.74833 |
16 | 256 | 64 | 0.623 | 0.74175 | 0.7501 | 0.75244 | 0.73805 |
16 | 256 | 128 | 0.624 | 0.745 | 0.7546 | 0.7581 | 0.75509 |
The recall never exceeds 0.76, which is lower than desired. The Redis HNSW indexes were therefore explored next.
Here are the results obtained with Redis:
m | efConstruction | efRuntime | micro-recall@1 | micro-recall@4 | micro-recall@10 | micro-recall@50 | micro-recall@100 |
---|---|---|---|---|---|---|---|
4 | 128 | 64 | 0.799 | 0.795 | 0.7968 | 0.76916 | 0.72839 |
4 | 128 | 128 | 0.822 | 0.81875 | 0.8215 | 0.79918 | 0.76812 |
4 | 256 | 64 | 0.83 | 0.8305 | 0.8254 | 0.79126 | 0.74795 |
4 | 256 | 128 | 0.855 | 0.8565 | 0.8511 | 0.82204 | 0.78854 |
8 | 128 | 64 | 0.955 | 0.94075 | 0.9318 | 0.90614 | 0.88248 |
8 | 128 | 128 | 0.964 | 0.95 | 0.9403 | 0.91784 | 0.90007 |
8 | 256 | 64 | 0.964 | 0.9515 | 0.9424 | 0.91672 | 0.89339 |
8 | 256 | 128 | 0.969 | 0.95725 | 0.9477 | 0.92514 | 0.90855 |
With the right parameters, these indexes achieve noticeably better recall.
However, Redis loads all the embeddings into RAM when loading the index, which amounts to more than 20 GB of RAM. This is too much for our use case.
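For reference, here is a hedged sketch of how such an HNSW index could be declared and queried with the `redis-py` client; the index name, key prefix, and field name are illustrative, and the parameters are the best-performing ones from the table above:

```python
import numpy as np
import redis
from redis.commands.search.field import VectorField
from redis.commands.search.query import Query

client = redis.Redis(host="localhost", port=6379)

# Hypothetical index ("logo_idx") and vector field ("embedding").
client.ft("logo_idx").create_index(
    fields=[
        VectorField(
            "embedding",
            "HNSW",
            {
                "TYPE": "FLOAT32",
                "DIM": 768,
                "DISTANCE_METRIC": "COSINE",
                "M": 8,
                "EF_CONSTRUCTION": 256,
                "EF_RUNTIME": 128,
            },
        )
    ],
)

# Store one embedding as raw bytes in a hash, then run a KNN query against it.
vector = np.random.rand(768).astype(np.float32)
client.hset("logo:1", mapping={"embedding": vector.tobytes()})

knn_query = (
    Query("*=>[KNN 10 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("score")
    .dialect(2)
)
results = client.ft("logo_idx").search(knn_query, query_params={"vec": vector.tobytes()})
```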
The last tool explored is Elasticsearch. With the default parameters, here is what was obtained:
m | efConstruction | num_candidates | micro-recall@1 | micro-recall@4 | micro-recall@10 | micro-recall@50 | micro-recall@100 |
---|---|---|---|---|---|---|---|
16 | 200 | 50 | 0.977 | 0.9635 | 0.9533 | 0.93832 | 0.93512 |
The results are excellent. The RAM used is below 5 GB, and the search time is around 80 ms, which is acceptable for our use case and well below the 3 seconds needed for an exact search with a FLAT faiss index.
Elasticsearch was thus chosen to perform the ANN search.
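For completeness, here is a minimal sketch (assuming the Python `elasticsearch` client and Elasticsearch 8.x) of a `dense_vector` mapping and kNN query using the default HNSW parameters from the table above; the index and field names are illustrative:

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Hypothetical "logos" index with the default HNSW parameters (m=16, ef_construction=200).
es.indices.create(
    index="logos",
    mappings={
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "hnsw", "m": 16, "ef_construction": 200},
            }
        }
    },
)

# Approximate kNN search; num_candidates plays the role of efSearch/efRuntime.
response = es.search(
    index="logos",
    knn={
        "field": "embedding",
        "query_vector": [0.0] * 768,  # placeholder; in practice, the query logo embedding
        "k": 10,
        "num_candidates": 50,
    },
)
```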