Cybernetics and Systems Analysis / Issue (2017, 53 (1))
Rachkovskij D.A.
Binary vectors for fast distance and similarity estimation
This review considers methods and algorithms for fast estimation of distance and similarity measures between initial data items, using vector representations with binary or integer-valued components. The initial data are mainly high-dimensional vectors, compared by various distance measures (angular, Euclidean, and others) and similarity measures (cosine, inner product, and others). The review discusses methods without learning, mainly those based on random projections followed by quantization, as well as sampling methods. The resulting vectors can be applied in similarity search, machine learning, and other algorithms. © 2017, Springer Science+Business Media New York.
Keywords: binarization, bins, distance, embedding, Johnson–Lindenstrauss lemma, kernel similarity, learning systems, locality-sensitive hashing, quantization, random projection, sampling, similarity, similarity search, sketch, vectors
Cite: Rachkovskij D.A. (2017). Binary vectors for fast distance and similarity estimation. Cybernetics and Systems Analysis, 53 (1), 160-183. doi: https://doi.org/10.1007/s10559-017-9914-x http://jnas.nbuv.gov.ua/article/UJRN-0000621682 [In Russian].
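As an illustration of the "random projection followed by quantization" idea named in the abstract, the sketch below shows one well-known instance: sign random projections, where each bit records on which side of a random Gaussian hyperplane a vector falls, and the fraction of differing bits between two codes estimates the angle between the vectors divided by π. This is a minimal sketch and not necessarily the formulation used in the review; the function names, the number of bits, and the seed are assumptions chosen for illustration.

```python
import math
import random


def make_planes(n_bits, dim, seed=0):
    """Draw n_bits random hyperplane normals with i.i.d. standard Gaussian components."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def sign_random_projection(x, planes):
    """Binary code of x: one bit per hyperplane, 1 if the projection is non-negative."""
    return [1 if sum(r_i * x_i for r_i, x_i in zip(r, x)) >= 0 else 0 for r in planes]


def estimate_angle(code_a, code_b):
    """Estimate the angle in radians as pi times the normalized Hamming distance."""
    hamming = sum(a != b for a, b in zip(code_a, code_b))
    return math.pi * hamming / len(code_a)


def true_angle(x, y):
    """Exact angle in radians between x and y, for comparison with the estimate."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return math.acos(max(-1.0, min(1.0, dot / (norm_x * norm_y))))


if __name__ == "__main__":
    rng = random.Random(1)
    dim, n_bits = 50, 2048  # illustrative sizes, not taken from the review
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    y = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    planes = make_planes(n_bits, dim)
    code_x = sign_random_projection(x, planes)
    code_y = sign_random_projection(y, planes)
    print("estimated angle:", estimate_angle(code_x, code_y))
    print("true angle:     ", true_angle(x, y))
```

The estimate becomes more accurate as the number of bits grows, and cosine similarity can be approximated by taking the cosine of the estimated angle. Comparing two codes needs only a Hamming distance, which is why such binary vectors are attractive for similarity search.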