- Bloom filter
The

**Bloom filter**, conceived by Burton H. Bloom in 1970, is a space-efficientprobabilistic data structure that is used to test whether an element is a member of a set. False positives are possible, but false negatives are not. Elements can be added to the set, but not removed (though this can be addressed with a counting filter). The more elements that are added to the set, the larger the probability of false positives.**Example**Google

BigTable uses Bloom filters to reduce the disk lookups for non-existent rows or columns. Avoiding costly disk lookups considerably increases the performance of a database query operation.The Squid Web Proxy

Cache uses Bloom filters for so-called [*http://wiki.squid-cache.org/SquidFaq/CacheDigests cache digests*] .**Algorithm description**An

**empty Bloom filter**is abit array of "m" bits, all set to 0. There must also be "k" differenthash function s defined, each of which maps a key value to one of the "m" array positions.To

**add**an element, feed it to each of the "k" hash functions to get "k" array positions. Set the bits at all these positions to 1.To

**query**for an element (test whether it is in the set), feed it to each of the "k" hash functions to get "k" array positions. If any of the bits at these positions are 0, the element is not in the set – if it were, then all the bits would have been set to 1 when it was inserted. If all are 1, then either the element is in the set, or the bits have been set to 1 during the insertion of other elements.The requirement of designing "k" different independent hash functions can be prohibitive for large "k". For a good

hash function with a wide output, there should be little if any correlation between different bit-fields of such a hash, so this type of hash can be used to generate multiple "different" hash functions by slicing its output into multiple bit fields. Alternatively, one can pass "k" different initial values (such as 0, 1, ..., "k"-1) to a hash function that takes an initial value; or add (or append) these values to the key. For larger "m" and/or "k", independence among the hash functions can be relaxed with negligible increase in false positive rate (harvtxt|Dillinger|Manolios|2004a, harvtxt|Kirsch|Mitzenmacher|2006). Specifically, harvtxt|Dillinger|Manolios|2004b show the effectiveness of usingenhanced double hashing ortriple hashing , variants ofdouble hashing , to derive the "k" indices using simple arithmetic on two or three indices computed with independent hash functions.Unfortunately, removing an element from this simple Bloom filter is impossible. The element maps to "k" bits, and although setting any one of these "k" bits to zero suffices to remove it, this has the side effect of removing any other elements that map onto that bit, and we have no way of determining whether any such elements have been added. Such removal would introduce a possibility for false negatives, which are not allowed.

Removal of an element from a Bloom filter can be simulated by having a second Bloom filter that contains items that have been removed. However, false positives in the second filter become false negatives in the composite filter, which are not permitted. This approach also limits the semantics of removal since re-adding a previously removed item is not possible.

However, it is often the case that all the keys are available but are expensive to enumerate (for example, requiring many disk reads). When the false positive rate gets too high, the filter can be regenerated; this should be a relatively rare event.

**pace and time advantages**While risking false positives, Bloom filters have a strong space advantage over other data structures for representing sets, such as

self-balancing binary search tree s,trie s,hash table s, or simplearray s orlinked list s of the entries. Most of these require storing at least the data items themselves, which can require anywhere from a small number of bits, for small integers, to an arbitrary number of bits, such as for strings (tries are an exception, since they can share storage between elements with equal prefixes). Linked structures incur an additional linear space overhead for pointers. A Bloom filter with 1% error and an optimal value of "k", on the other hand, requires only about 9.6 bits per element — regardless of the size of the elements. This advantage comes partly from its compactness, inherited fromarray s, and partly from its probabilistic nature. If a 1% false positive rate seems too high, each time we add about 4.8 bits per element we decrease it by ten times.However, if the number of potential values is small and many of them can be in the set, then the Bloom filter is easily surpassed by the deterministic

bit array , which requires only one bit for each potential element. Note also that hash tables gain a space and time advantage if they begin ignoring collisions and only store whether each bucket contains an entry; in this case, they have effectively become Bloom filters with "k" = 1.Bloom filters also have the unusual property that the time needed to either add items or to check whether an item is in the set is a fixed constant, O("k"), completely independent of the number of items already in the set. No other constant-space set data structure has this property, but the average access time of sparse

hash table s can make them faster in practice than some Bloom filters. In a hardware implementation, however, the Bloom filter shines because its "k" lookups are independent and can be parallelized.To understand its space efficiency, it is instructive to compare the general Bloom filter with its special case when "k" = 1. If "k" = 1, then in order to keep the false positive rate sufficiently low, a small fraction of bits should be set, which means the array must be very large and contain long runs of zeros. The

information content of the array relative to its size is low. The generalized Bloom filter ("k" greater than 1) allows many more bits to be set while still maintaining a low false positive rate; if the parameters ("k" and "m") are chosen well, about half of the bits will be set, and these will be apparently random, minimizing redundancy and maximizing information content.**Probability of false positives**Assume that a

hash function selects each array position with equal probability. If "m" is the number of bits in the array, the probability that a certain bit is not set to one by a certain hash function during the insertion of an element is then:$1-frac\{1\}\{m\}.$

The probability that it is not set by any of the hash functions is

:$left(1-frac\{1\}\{m\}\; ight)^k.$

If we have inserted "n" elements, the probability that a certain bit is still 0 is

:$left(1-frac\{1\}\{m\}\; ight)^\{kn\};$

the probability that it is 1 is therefore

:$1-left(1-frac\{1\}\{m\}\; ight)^\{kn\}.$

Now test membership of an element that is not in the set. Each of the "k" array positions computed by the hash functions is 1 with a probability as above. The probability of all of them being 1, which would cause the

algorithm to erroneously claim that the element is in the set, is then:$left(1-left(1-frac\{1\}\{m\}\; ight)^\{kn\}\; ight)^k\; approx\; left(\; 1-e^\{-kn/m\}\; ight)^k.$

Obviously, the probability of false positives decreases as "m" (the number of bits in the array) increases, and increases as "n" (the number of inserted elements) increases. For a given "m" and "n", the value of "k" (the number of hash functions) that minimizes the probability is

:$frac\{m\}\{n\}ln\; 2\; approx\; frac\{9m\}\{13n\}\; approx\; 0.7frac\{m\}\{n\},$

which gives the false positive probability of

:$left(frac\{1\}\{2\}\; ight)^\{k\}\; approx\; \{0.6185\}^\{m/n\}.$

**Interesting properties***Unlike sets based on

hash table s, any Bloom filter can represent the entire universe of elements. In this case, all bits are 1. Another consequence of this property is that**add**never fails due to the data structure "filling up," although the false positive rate increases steadily as elements are added.*Union and intersection of Bloom filters with the same size and set of hash functions can be implemented with bitwise OR and AND operations, respectively.

**Alternatives to Bloom filters**Classic Bloom filters use $1.44log2(1/epsilon)$ bits of space per inserted key, where $epsilon$ is the false positive rate of the Bloom filter. However the space that is strictly necessary for any data structure playing the same role as a Bloom filter is only $log2(1/epsilon)$ per key harv|Pagh|Pagh|Rao|2005. Hence Bloom filters use 44% more space than a hypothetical equivalent optimal data structure. The number of hash functions used to achieve a given false positive rate $epsilon$ is proportional to $1/epsilon$ which is not optimal as it has been proved that an optimal data structure would need only a constant number of hash functions independent of the false positive rate.

harvtxt|Stern|Dill|1996 describe a probabilistic structure based on

hash table s,hash compaction , which harvtxt|Dillinger|Manolios|2004b identify as significantly more accurate than a Bloom filter when each is configured optimally. Dillinger and Manolios, however, point out that the reasonable accuracy of any given Bloom filter over a wide range of numbers of additions makes it attractive for probabilistic enumeration of state spaces of unknown size. Hash compaction is, therefore, attractive when the number of additions can be predicted accurately; however, despite being very fast in software, hash compaction is poorly-suited for hardware because of worst-case linear access time.harvtxt|Putze|Sanders|Singler|2007 have studied some variants of Bloom filters that are either faster or use less space than classic Bloom filters. The basic idea of the fast variant is to locate the k hash values associated with each key into one or two blocks having the same size as processor's memory cache blocks (usually 64 bytes). This will presumably improve performance by reducing the number of potential memory

cache misses . The proposed variants have however the drawback of using about 32% more space than classic Bloom filters.The space efficient variant relies on using a single hash function that generates for each key a value in the range $left\; [0dotsfrac\{n\}\{varepsilon\}\; ight]$ where $epsilon$ is the requested false positive rate. The sequence of values is then sorted and compressed using

golomb coding (or some other compression technique) to occupy a space close to $nlog2(1/epsilon)$ bits. To query the Bloom filter for a given key, it will suffice to check if its corresponding value is stored in the Bloom filter. Decompressing the whole Bloom filter for each query would make this variant totally unusable. To overcome this problem the sequence of values is divided into small blocks of equal size that are compressed separately. At query time only half a block will need to be decompressed on average. Because of decompression overhead, this variant may be slower than classic Bloom filters but this may be compensated by the fact that a single hash function need to be computed.Another alternative to classic Bloom filter is the one based on space efficient variants of

cuckoo hashing . In this case once the hash table is constructed, the keys stored in the hash table are replaced with short signatures of the keys. Those signatures are strings of bits computed using a hash function applied on the keys.**Counting filters**Counting filters provide a way to implement a "delete" operation on a Bloom filter without recreating the filter afresh. In a counting filter the array positions (buckets) are extended from being a single bit, to being an n-bit counter. In fact, regular Bloom filters can be considered as counting filters with a bucket size of one bit. Counting filters were introduced by harvtxt|Fan|Cao|Almeida|Broder|1998.

The insert operation is extended to "increment" the value of the buckets and the lookup operation checks that each of the required buckets is non-zero. The delete operation, obviously, then consists of decrementing the value of each of the respective buckets.

Arithmetic overflow of the buckets is a problem and the buckets should be sufficiently large to make this case rare. If it does occur then the increment and decrement operations must leave the bucket set to the maximum possible value in order to retain the properties of a Bloom filter.The size of counters is usually 3 or 4 bits. Hence counting Bloom filters use 3 to 4 times more space than static Bloom filters. In theory, an optimal data structure equivalent to a counting Bloom filter should not use more space than a static Bloom filter.

Another issue with counting filters is in their scalability. Because the counting Bloom filter table can not be resized, the maximal number of keys to be stored simultaneously in the filter must be known in advance. Once the designed capacity of the table is exceeded the false positive rate will grow rapidly as more keys are inserted.

harvtxt|Bonomi|Mitzenmacher|Panigrahy|Singhand|2006 introduced a Data structure based on d-left hashing that is functionally equivalent but uses approximately half as much space as counting Bloom filters. The scalabilityissue does not occur in this data structure. Once the designed capacity is exceeded, the keys could be reinserted in a new hash table of double size.

The space efficient variant by harvtxt|Putze|Sanders|Singler|2007 could also be used to implement counting filters by supporting insertions and deletions.

**Bloomier filters**harvtxt|Chazelle|Kilian|Rubinfeld|Tal|2004 designed a generalization of Bloom filters that could associate a value with each element that had been inserted, implementing an

associative array . Like Bloom filters, these structures achieve a small space overhead by accepting a small probability of false positives. In the case of "Bloomier filters", a "false positive" is defined as returning a result when the key is not in the map. The map will never return the wrong value for a key that "is" in the map.The simplest Bloomier filter is near-optimal and fairly simple to describe. Suppose initially that the only possible values are 0 and 1. We create a pair of Bloom filters "A"

_{0}and "B"_{0}which contain, respectively, all values mapping to 0 and all values mapping to 1. Then, to determine which value a given key maps to, we look it up in both filters. If it is in neither, then the key is not in the map. If the key is in "A"_{0}but not "B"_{0}, then it does not map to 1, and has a high probability of mapping to 0. Conversely, if the key is in "B"_{0}but not "A"_{0}, then it does not map to 0 and has a high probability of mapping to 1.A problem arises, however, when "both" filters claim to contain the item. We never insert an item into both, so one or both of the filters is lying (producing a false positive), but we don't know which. To determine this, we have another, smaller pair of filters "A"

_{1}and "B"_{1}. "A"_{1}contains values that map to 0 and which are false positives in "B"_{0}; "B"_{1}contains values that map to 1 and which are false positives in "A"_{0}. But whenever "A"_{0}and "B"_{0}both produce positives, at most one of these cases must occur, and so we simply have to determine which if any of the two filters "A"_{1}and "B"_{1}contains the key, another instance of our original problem.It may so happen again that both filters produce a positive; we apply the same idea recursively to solve this problem. Because each pair of filters only contains keys that are in the map "and" produced false positives on all previous filter pairs, the number of keys is extremely likely to quickly drop to a very small quantity that can be easily stored in an ordinary deterministic map, such as a pair of small arrays with linear search. Moreover, the average total search time is a constant, because almost all queries will be resolved by the first pair, almost all remaining queries by the second pair, and so on. The total space required is independent of "n", and is almost entirely occupied by the first filter pair.

Now that we have the structure and a search algorithm, we also need to know how to insert new key/value pairs. The program must not attempt to insert the same key with both values. If the value is 0, insert the key into "A"

_{0}and then test if the key is in "B"_{0}. If so, this is a false positive for "B"_{0}, and the key must also be inserted into "A"_{1}recursively in the same manner. If we reach the last level, we simply insert it. When the value is 1, the operation is similar but with "A" and "B" reversed.Now that we can map a key to the value 0 or 1, how does this help us map to general values? This is simple. We create a single such Bloomier filter for each bit of the result. If the values are large, we can instead map keys to hash values that can be used to retrieve the actual values. The space required for a Bloomier filter with "n"-bit values is typically slightly more than the space for 2"n" Bloom filters.

A very simple way to implement Bloomier filters is by means of minimal

perfect hashing . A minimal perfect hash function h is first generated for the set of n keys. Then an array is filled with n pairs (signature,value) associated with each key at the positions given by function h when applied on each key. The signature of a key is a string of r bits computed by applying a hash function g of range $2^r$ on the key. The value of r is chosen such that $2^r>=1/epsilon$, where $epsilon$ is the requested false positive rate. To query for a given key, hash function h is first applied on the key. This will give a position into the array from which we retrieve a pair (signature,value). Then we compute the signature of the key using function g. If the computed signature is the same as retrieved signature we return the retrieved value. The probabaility of false positive is $1/2^r$.Dynamic Bloomier filters have been studied by harvtxt|Mortensen|Pagh|Pătraşcu|2005. They proved that any dynamic Bloomier filter needs at least around$log(l)$ bits per key where l is the length of the key. A simple dynamic version of Bloomier filters can be implemented using two dynamic data structures. Let the two data structures be noted S1 and S2. S1 will store keys with their associated data while S2 will only store signatures of keys with their associated data. Those signatures are simply hash values of keys in the range $[0..frac\{n\}\{epsilon\}]$ where n is the maximal number of keys to be stored in the Bloomier filter and $epsilon$ is the requested false positive rate. To insert a key in the Bloomier filter, its hash value is first computed. Then the algorithm checks if a key with the same hash value already exists in S2. If this is not the case, the hash value is inserted in S2 along with data associated with the key. If the same hash value already exists in S2 then the key is inserted into S1 along with its associated data. The deletion is symmetric : if the key already exists in S1 it will be deleted from there, otherwise the hash value associated with the key is deleted from S2. An issue with this algorithm is on how to store efficiently S1 and S2. For S1 any hash algorithm can be used. To store S2

golomb coding could be applied to compress signatures to use a space close to $log2(1/epsilon)$ per key.**Compact approximators**harvtxt|Boldi|Vigna|2005 proposed a lattice-based generalization of Bloom filters. A

**compact approximator**associates to each key an element of a lattice (the standard Bloom filters being the case of the Boolean two-element lattice). Instead of a bit array, they have an array of lattice elements. When adding a new association between a key and an element of the lattice, they maximize the current content of the`k`array locations associated to the key with the lattice element. When reading the value associated to a key, they minimize the values found in the`k`locations associated to the key. The resulting value approximates from above the original value.**References**

*citation

first = Burton H. | last = Bloom

title = Space/time trade-offs in hash coding with allowable errors

journal =Communications of the ACM

volume = 13 | issue = 7 | year = 1970 | pages = 422–426

doi = 10.1145/362686.362692*citation

first1 = Paolo | last1 = Boldi

first2 = Sebastiano | last2 = Vigna

title = Mutable strings in Java: design, implementation and lightweight text-search algorithms

journal = Science of Computer Programming

volume = 54 | issue = 1 | year = 2005 | pages = 3–23

doi = 10.1016/j.scico.2004.05.003*citation

first1 = Flavio | last1 = Bonomi

first2 = Michael | last2 = Mitzenmacher

first3 = Rina | last3 = Panigrahy

first4 = Sushil | last4 = Singh

first5 = George |last5 = Varghese

contribution = An Improved Construction for Counting Bloom Filters

title = Algorithms – ESA 2006, 14th Annual European Symposium

year = 2006 | pages = 684–695 | doi = 10.1007/11841036_61*citation

first1 = Andrei | last1 = Broder | authorlink1 = Andrei Broder

first2 = Michael | last2 = Mitzenmacher

title = Network Applications of Bloom Filters: A Survey

journal = Internet Mathematics

volume = 1 | issue = 4 | pages = 485–509 | year = 2005

url = http://www.eecs.harvard.edu/~michaelm/postscripts/im2005b.pdf*citation

first1 = Bernard | last1 = Chazelle | authorlink1 = Bernard Chazelle

first2 = Joe | last2 = Kilian

first3 = Ronitt | last3 = Rubinfeld

first4 = Ayellet | last4 = Tal

contribution = The Bloomier filter: an efficient data structure for static support lookup tables

title = Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms

year = 2004 | pages = 30–39

url = http://www.ee.technion.ac.il/~ayellet/Ps/nelson.pdf*citation

first1 = Saar | last1 = Cohen

first2 = Yossi | last2 = Matias

contribution = Spectral Bloom Filters

title = Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data

year = 2003 | doi = 10.1145/872757.872787 | pages = 241–252

url = http://www.sigmod.org/sigmod03/eproceedings/papers/r09p02.pdf*citation

first1 = Sarang | last1 = Dharmapurikar

first2 = Haoyu | last2 = Song

first3 = Jonathan | last3 = Turner

first4 = John | last4 = Lockwood

contribution = Fast packet classification using Bloom filters

title = Proceedings of the 2006 ACM/IEEE Symposium on Architecture for Networking and Communications Systems

year = 2006 | pages = 61–70 | doi = 10.1145/1185347.1185356

url = http://www.arl.wustl.edu/~sarang/ancs6819-dharmapurikar.pdf*citation

first1 = Peter C. | last1 = Dillinger

first2 = Panagiotis | last2 = Manolios

contribution = Fast and Accurate Bitstate Verification for SPIN

title = Proceedings of the 11th Internation Spin Workshop on Model Checking Software

publisher = Springer-Verlag, Lecture Notes in Computer Science 2989

year = 2004a

url = http://www.cc.gatech.edu/fac/Pete.Manolios/research/spin-3spin.html*citation

first1 = Peter C. | last1 = Dillinger

first2 = Panagiotis | last2 = Manolios

contribution = Bloom Filters in Probabilistic Verification

title = Proceedings of the 5th Internation Conference on Formal Methods in Computer-Aided Design

publisher = Springer-Verlag, Lecture Notes in Computer Science 3312

year = 2004b

url = http://www.cc.gatech.edu/fac/Pete.Manolios/research/bloom-filters-verification.html*citation

first1 = Benoit | last1 = Donnet

first2 = Bruno | last2 = Baynat

first3 = Timur | last3 = Friedman

contribution = Retouched Bloom Filters: Allowing Networked Applications to Flexibly Trade Off False Positives Against False Negatives

title = CoNEXT 06 – 2nd Conference on Future Networking Technologies | year = 2006

url = http://adetti.iscte.pt/events/CONEXT06/Conext06_Proceedings/papers/13.html*citation

first1 = David | last1 = Eppstein | authorlink1 = David Eppstein

first2 = Michael T. | last2 = Goodrich

contribution = Space-efficient straggler identification in round-trip data streams via Newton's identities and invertible Bloom filters

title = Algorithms and Data Structures, 10th International Workshop, WADS 2007

publisher = Springer-Verlag, Lecture Notes in Computer Science 4619

year = 2007 | pages = 637–648, arxiv | 0704.3313*citation

first1 = Li | last1 = Fan | first2 = Pei | last2 = Cao

first3 = Jussara | last3 = Almeida

first4 = Andrei | last4 = Broder | authorlink4 = Andrei Broder

title = Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol

journal = IEEE/ACM Transactions on Networking

volume = 8 | issue = 3 | year = 2000 | pages = 281–293 | doi = 10.1109/90.851975. A preliminary version appeared at SIGCOMM '98.*citation

first1 = Adam |last1 = Kirsch | first2 = Michael | last2 = Mitzenmacher

contribution = Less Hashing, Same Performance: Building a Better Bloom Filter

title = Algorithms – ESA 2006, 14th Annual European Symposium

publisher = Springer-Verlag, Lecture Notes in Computer Science 4168

year = 2006 | pages = 456–467 | doi = 10.1007/11841036

url = http://www.eecs.harvard.edu/~kirsch/pubs/bbbf/esa06.pdf*citation

first1 = Christian Worm | last1 = Mortensen

first2 = Rasmus | last2 = Pagh

first3 = Mihai | last3 = Pătraşcu

contribution = On dynamic range reporting in one dimension

title = Proceedings of the Thirty-seventh Annual ACM Symposium on Theory of Computing

year = 2005 | pages = 104–111 | doi = 10.1145/1060590.1060606*citation

first1 = Anna | last1 = Pagh

first2 = Rasmus | last2 = Pagh

first3 = S. Srinivasa | last3 = Rao

contribution = An optimal Bloom filter replacement

title = Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms

year = 2005 | pages = 823–829

url = http://www.it-c.dk/people/pagh/papers/bloom.pdf*citation

first1 = F. | last1 = Putze

first2 = P. | last2 = Sanders

first3 = J. | last3 = Singler

contribution = Cache-, Hash- and Space-Efficient Bloom Filters

title = Experimental Algorithms, 6th International Workshop, WEA 2007

publisher = Springer-Verlag, Lecture Notes in Computer Science 4525

year = 2007 | pages = 108–121 | doi = 10.1007/978-3-540-72845-0

url = http://algo2.iti.uni-karlsruhe.de/singler/publications/cacheefficientbloomfilters-wea2007.pdf*citation

first1 = Simha | last1 = Sethumadhavan

first2 = Rajagopalan | last2 = Desikan

first3 = Doug | last3 = Burger

first4 = Charles R. | last4 = Moore

first5 = Stephen W. | last5 = Keckler

contribution = Scalable hardware memory disambiguation for high ILP processors

title = 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003, MICRO-36

year = 2003 | pages = 399–410 | doi = 10.1109/MICRO.2003.1253244

url = http://www.cs.utexas.edu/users/simha/publications/lsq.pdf*citation

first1 = Kulesh | last1 = Shanmugasundaram

first2 = Hervé | last2 = Brönnimann

first3 = Nasir | last3 = Memon

contribution = Payload attribution via hierarchical Bloom filters

title = Proceedings of the 11th ACM Conference on Computer and Communications Security

year = 2004 | pages = 31–41 | doi = 10.1145/1030083.1030089*citation

first1 = Ulrich | last1 = Stern

first2 = David L. | last2 = Dill

contribution = A New Scheme for Memory-Efficient Probabilistic Verification

title = Proceedings of Formal Description Techniques for Distributed Systems and Communication Protocols, and Protocol Specification, Testing, and Verification: IFIP TC6/WG6.1 Joint International Conference

publisher = Chapman & Hall, IFIP Conference Proceedings

year = 1996 | pages = 333-348

url = http://sprout.stanford.edu/dill/PAPERS/verification/SD96A.ps*citation

first1 = Fay| last1 = Chang

first2 = Jeffrey| last2 = Dean

first3 = Sanjay | last3 = Ghemawat

first4 = Wilson | last4 = Hsieh

first5 = Deborah | last5 = Wallach

first6 = Mike | last6 = Burrows

first7 = Tushar | last7 = Chandra

first8 = Andrew | last8 = Fikes

first9 = Robert | last9 = Gruber

contribution = Bigtable: A Distributed Storage System for Structured Data

title = Seventh Symposium on Operating System Design and Implementation

year = 2006 | url = http://labs.google.com/papers/bigtable.html

*citation

first1 = Mahmood| last1 = Ahmadi

first2 = Stephan| last2 = Wong

| contribution = A Cache Architecture for Counting Bloom Filters

title = 15th international Conference on Netwroks (ICON-2007)

year = 2007 | url = http://www.ieeexplore.ieee.org/xpls/abs_all.jsp?isnumber=4444031&arnumber=4444089&count=113&index=57**External links*** [

*http://libhashish.sf.net C/C++ library with comprehensive analysis functionality and documentation*]

* [*http://www.cs.wisc.edu/~cao/papers/summary-cache/node8.html Table of false-positive rates for different configurations*]

* [*http://www.cc.gatech.edu/fac/Pete.Manolios/bloom-filters/calculator.html Online Bloom filter calculator*]

* [*http://blogs.sun.com/bblfish/entry/my_bloomin_friends Bloom Filters and Social Networks with Java applet demo*]**Implementations*** [

*http://www.partow.net/programming/hashfunctions/index.html Implementation in C++ and Object Pascal*]

* [*http://code.google.com/p/bloomerl/ Implementation in Erlang*]

* [*http://hackage.haskell.org/cgi-bin/hackage-scripts/package/bloomfilter Implementation in Haskell*]

* [*http://wwwse.inf.tu-dresden.de/xsiena/bloom_filter Implementation in Java*]

* [*http://la.ma.la/misc/js/bloomfilter/ Implementation in Javascript*]

* [*http://lemonodor.com/archives/000881.html Implementation in Lisp*]

* [*http://search.cpan.org/dist/Bloom-Filter/ Implementation in Perl*]

* [*http://www.imperialviolet.org/pybloom.html Implementation in Python*]

* [*http://www.rubyinside.com/bloom-filters-a-powerful-tool-599.html Implementation in Ruby*]

*Wikimedia Foundation.
2010.*

### Look at other dictionaries:

**Bloom-filter**— sind probabilistische Datenstrukturen, mit deren Hilfe sehr schnell festgestellt werden kann, welche Daten in einem kontinuierlich fließenden Datenstrom schon einmal vorgekommen sind und welche erstmals auftreten. Hierzu wird mit einer geeigneten … Deutsch Wikipedia**Bloom**— The term bloom usually refers to the general expression describing the aesthetic experience of one or more flowers on a flowering plant. Also used as a metaphor for young people at the peak of their beauty or health. See also Blossom.Bloom or… … Wikipedia**Filtre De Bloom**— Le filtre de Bloom, conçu par Burton H. Bloom en 1970, est une structure de données probabiliste qui optimise l espace utilisé. Cette structure est utilisée pour tester si un élément fait partie d un ensemble. Il permet notamment d optimiser les… … Wikipédia en Français**Filtre de bloom**— Le filtre de Bloom, conçu par Burton H. Bloom en 1970, est une structure de données probabiliste qui optimise l espace utilisé. Cette structure est utilisée pour tester si un élément fait partie d un ensemble. Il permet notamment d optimiser les… … Wikipédia en Français**Atomic line filter**— A potassium Faraday filter designed, built and photographed by Jonas Hedin for making daytime LIDAR measurements at Arecibo Observatory.[1] An atomic line filter (ALF) is an advanced optical band pass filter used in the physical sciences for… … Wikipedia**Algal bloom**— Algal blooms can present problems for ecosystems and human society An algal bloom is a rapid increase or accumulation in the population of algae (typically microscopic) in an aquatic system. Algal blooms may occur in freshwater as well as marine… … Wikipedia**Hash filter**— A hash filter creates a hash sum from data, typically e mail, and compares the sum against other previously defined sums. Depending on the purpose of the filter, the data can then be included or excluded in a function based on whether it matches… … Wikipedia**Bloomfilter**— Bloom Filter sind probabilistische Datenstrukturen, mit deren Hilfe sehr schnell festgestellt werden kann, welche Daten in einem kontinuierlich fließenden Datenstrom schon einmal vorgekommen sind und welche erstmals auftreten. Hierzu wird mit… … Deutsch Wikipedia**Bit array**— A bit array (or bitmap, in some cases) is an array data structure which compactly stores individual bits (boolean values). It implements a simple set data structure storing a subset of {1,2,..., n } and is effective at exploiting bit level… … Wikipedia**Cuckoo hashing**— example. The arrows show the alternative location of each key. A new item would be inserted in the location of A by moving A to its alternative location, currently occupied by B, and moving B to its alternative location which is currently vacant … Wikipedia