Wednesday, 15 March 2017

Bloom Filter hash is returning far too many collisions

https://github.com/connorjburton/es6-spelling-correction/blob/master/src/BloomFilter.js

You can run the above code with npm run index

I have been trying to implement my own (simple) Bloom Filter but am stuck on hashing, I understand the concept of hashing the item multiple times and populating the bit array with the indices.

However, I am seeing a ton of collisions in my hashing, I am using 1 hash algorithm (I have tried FNV, murmurhash and now farmhash) with various seeds (based on current nanoseconds).

I must be doing something wrong, I am calculating the k functions by following the information here and setting the same amount of seeds.

Any help would be great, thanks.



via Connor Burton

No comments:

Post a Comment