An autoscaling Bloom filter with an ultra-low memory footprint for PHP. Ok Bloomer employs a novel layered filtering strategy that allows it to expand while maintaining an upper bound on the false positive rate. Each layer consists of a bitmap that remembers the hash signatures of the items inserted so far. If an item gets caught in the filter, then it has probably been seen before. However, if an item passes through the filter, then it has definitely never been seen before.
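To make the "probably seen" versus "definitely not seen" distinction concrete, here is a minimal sketch of a single layer. It is not Ok Bloomer's implementation: the `ToyLayer` class, the tiny 1024-bit layer, and the trick of salting the token with the slice index to derive several hashes from `crc32` are all assumptions made for illustration.

```php
<?php

// Toy single-layer filter for illustration only (hypothetical class).
final class ToyLayer
{
    /** Raw bytes holding 8 bits each. */
    private string $bitmap;

    private int $numHashes;

    private int $sliceSize;

    public function __construct(int $numHashes = 4, int $layerSize = 1024)
    {
        $this->numHashes = $numHashes;
        $this->sliceSize = intdiv($layerSize, $numHashes);
        $this->bitmap = str_repeat("\0", (int) ceil($layerSize / 8));
    }

    /** Return one bit offset per slice for the given token. */
    private function offsets(string $token): array
    {
        $offsets = [];

        for ($i = 0; $i < $this->numHashes; ++$i) {
            // Assumption: salting with the slice index yields a different hash per slice.
            // The mask keeps the value non-negative on 32-bit builds.
            $hash = crc32($i . $token) & 0x7FFFFFFF;

            $offsets[] = $i * $this->sliceSize + $hash % $this->sliceSize;
        }

        return $offsets;
    }

    public function insert(string $token): void
    {
        foreach ($this->offsets($token) as $offset) {
            $byte = intdiv($offset, 8);

            $this->bitmap[$byte] = chr(ord($this->bitmap[$byte]) | (1 << ($offset % 8)));
        }
    }

    public function exists(string $token): bool
    {
        foreach ($this->offsets($token) as $offset) {
            if ((ord($this->bitmap[intdiv($offset, 8)]) & (1 << ($offset % 8))) === 0) {
                return false; // One unset bit proves the token was never inserted.
            }
        }

        return true; // Every bit is set: the token was probably inserted.
    }
}

$layer = new ToyLayer();
$layer->insert('foo');

var_dump($layer->exists('foo')); // bool(true): inserted tokens are always caught
```

Because inserting only ever sets bits, a token that was inserted can never be reported as absent. A false positive happens only when other tokens have already set all of the bits that an unseen token maps to.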
- Ultra-low memory footprint
- Autoscaling works on streaming data
- Bounded maximum false positive rate
- Open-source and free to use commercially
Install into your project using Composer:
$ composer require andrewdalpino/okbloomer
- PHP 7.4 or above
A probabilistic data structure that estimates the prior occurrence of a given item with a maximum false positive rate.
| # | Name | Default | Type | Description |
|---|---|---|---|---|
| 1 | maxFalsePositiveRate | 0.01 | float | The false positive rate to remain below. |
| 2 | numHashes | 4 | int, null | The number of hash functions used, i.e. the number of slices per layer. Set to |
| 3 | layerSize | 32000000 | int | The size of each layer of the filter in bits. |
| 4 | hashFn | 'crc32' | callable | The hash function that accepts a string token and returns an integer. |
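As a rough guide to how these parameters interact, the sketch below estimates how many distinct items a single, standalone layer could absorb before its expected false positive rate reaches the target. It uses the textbook approximation for a Bloom filter partitioned into slices and assumes ideal, uniform hashing. The function name `estimateLayerCapacity` is hypothetical, and this is not necessarily the rule Ok Bloomer uses to decide when to add a layer.

```php
<?php

/**
 * Estimate how many distinct items one sliced layer can hold before its
 * expected false positive rate reaches $maxFalsePositiveRate.
 */
function estimateLayerCapacity(float $maxFalsePositiveRate, int $numHashes, int $layerSize): int
{
    // A false positive needs a set bit in every slice, so rate = fill ** numHashes.
    $maxFill = $maxFalsePositiveRate ** (1.0 / $numHashes);

    $sliceSize = intdiv($layerSize, $numHashes);

    // After n inserts, the expected fill of a slice is about 1 - exp(-n / sliceSize).
    return (int) floor(-$sliceSize * log(1.0 - $maxFill));
}

// The default parameters from the table above.
echo estimateLayerCapacity(0.01, 4, 32000000), PHP_EOL;
```

With the defaults this works out to roughly three million items per layer. Lowering `maxFalsePositiveRate` or shrinking `layerSize` reduces that figure.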
use OkBloomer\BloomFilter;

$filter = new BloomFilter(0.01, 4, 32000000);

$filter->insert('foo');

var_dump($filter->exists('foo'));

var_dump($filter->existsOrInsert('bar'));

var_dump($filter->exists('bar'));

bool(true)
bool(false)
bool(true)
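The `existsOrInsert()` call in the example above lends itself to deduplicating a stream in a single pass. The sketch below shows that pattern under a few assumptions: `dedupe` is a hypothetical helper rather than part of the library, and an exact array-backed closure stands in for the filter so that the snippet runs on its own.

```php
<?php

/**
 * Lazily yield only the tokens that $seenBefore has not reported yet.
 *
 * $seenBefore must return true when the token was already recorded,
 * otherwise record the token and return false.
 */
function dedupe(iterable $tokens, callable $seenBefore): Generator
{
    foreach ($tokens as $token) {
        if (!$seenBefore($token)) {
            yield $token;
        }
    }
}

// With the library installed, the callable could be [$filter, 'existsOrInsert'].
// Here an exact array-backed stand-in keeps the sketch self-contained.
$seen = [];

$exact = function (string $token) use (&$seen): bool {
    $was = isset($seen[$token]);

    $seen[$token] = true;

    return $was;
};

foreach (dedupe(['foo', 'bar', 'foo', 'baz', 'bar'], $exact) as $token) {
    echo $token, PHP_EOL; // prints foo, bar, baz
}
```

When a real Bloom filter backs the callable, an occasional never-seen token will be dropped as a false positive, at a rate no higher than the configured maximum. Duplicates are never let through.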
To run the unit tests:
$ composer test
To run static code analysis:
$ composer analyze
To run the benchmarks:
$ composer benchmark
- P. S. Almeida et al. (2007). Scalable Bloom Filters.