gbh qed

machine learning + computer vision + facts + proofs

Posts Tagged ‘fast

Fast Bit Counting

leave a comment »

Awhile ago, I was reading papers on various hashing algorithms, such as Weiss, Torralba, and Fergus’s Spectral Hashing.  One use for these algorithms is in computing approximate distance and nearest neighbors in very large data sets, where exact computation would be too slow.  Data points are instead mapped to bit codes that are constructed such that nearby points will map to similar bit codes.

One question this raised, and one which also apparently potential interview question, is what is a fast way to compute the number of one bits in a number (allowing you to compute the Hamming distance between bit codes).

It’s an interesting question, and this page – Puzzle: Fast Bit Counting – gives a number of clever algorithms for computing the number of one bits.

Of course, this isn’t necessarily relevant in the context of using the hashing to bit codes for computing nearest neighbors.  Instead, given a query, you’d probably want to find all data points within a Hamming distance of k.  For this, I imagine you would pre-compute all bit codes with k or less one bits, and then run through this list, XOR (^ in C) the query with the bit code to get a new code with Hamming distance less than or equal to k from the query, and then test to see if any data point was hashed to the new code.

Written by gbhqed

February 25, 2010 at 1:37 am