Boyer–Moore string search algorithm


Boyer–Moore string search algorithm

The Boyer–Moore string search algorithm is a particularly efficient string searching algorithm, and it has been the standard benchmark for the practical string search literature. [Hume and Sunday (1991) " [Fast String Searching] " SOFTWARE—PRACTICE AND EXPERIENCE, VOL. 21(11), 1221–1248 (NOVEMBER 1991)] It was developed by Bob Boyer and J Strother Moore in 1977. The algorithm preprocesses the target string (key) that is being searched for, but not the string being searched (unlike some algorithms that preprocess the string to be searched and can then amortize the expense of the preprocessing by searching repeatedly). The execution time of the Boyer-Moore algorithm can be sub-linear: it doesn't need to check every character of the string to be searched, but rather skips over some of them. Generally the algorithm gets faster as the key being searched for becomes longer. Its efficiency derives from the fact that with each unsuccessful attempt to find a match between the search string and the text it's searching, it uses the information gained from that attempt to rule out as many positions of the text as possible where the string cannot match.

How the algorithm works

The mismatch "A" in position 5 (3 back from the last letter of the needle) excludes the first 6 of the possible starting positions shown.

The second table is slightly more difficult to calculate: for each value of "i" less than the length of the search string, we must first calculate the pattern consisting of the last "i" characters of the search string, preceded by a "mis"-match for the character before it; then weinitially line it up with the search pattern and determine the least number of characters the partial pattern must be shifted left before the two patterns match. For instance, for the search string ANPANMAN, the table would be as follows: (N signifies any character that is not N)

The amount of shift calculated by the second table is sometimes called the "good suffix shift" [http://www.movsd.com/bm.htm] or "(strong) good suffix rule". The original published Boyer-Moore algorithm [.

Uses

Wikipedia uses a patched version of PHP which uses an implementation of Boyer-Moore to speed up page loading. Wikipedia database developer Domas Mituzas discussed Wikipedia's use of the Boyer-Moore algorithm in a 2007 presentation:

"Fast String Search" is a replacement module for PHP’s strtr() function, which uses a Commentz-Walter–style algorithm for multiple search terms, or the Boyer-Moore algorithm for single search terms. License collisions (GPL code was used for it) do not allow its inclusion in PHP. Using a proper algorithm instead of foreach loops is an incredible boost for some applications. [D. Mituzas, "Wikipedia: Site internals, configuration, code examples and management issues", http://dammit.lt/uc/workbook2007.pdf]

ee also

* Knuth–Morris–Pratt algorithm

References

External links

* [http://www.cs.utexas.edu/~moore/publications/fstrpos.pdf Original article]
* [http://www-sr.informatik.uni-tuebingen.de/~buehler/BM/BM1.html Animation] of the Boyer-Moore algorithm
* [http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/fstrpos-example.html An example of the Boyer-Moore algorithm] from the homepage of J Strother Moore, co-inventor of the algorithm
* [http://www-igm.univ-mlv.fr/%7Elecroq/string/node14.html An explanation of the algorithm (with sample C code)]
* [http://www.cs.nyu.edu/cs/faculty/cole/papers/CHPZ95.ps Cole et al, Tighter lower bounds on the exact complexity of string matching]


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Rabin-Karp string search algorithm — The Rabin Karp algorithm is a string searching algorithm created by Michael O. Rabin and Richard M. Karp in 1987 that uses hashing to find a substring in a text. It is used for multiple pattern matching rather than single pattern matching. For… …   Wikipedia

  • String searching algorithm — String searching algorithms, sometimes called string matching algorithms, are an important class of string algorithms that try to find a place where one or several strings (also called patterns) are found within a larger string or text. Let Σ be… …   Wikipedia

  • Boyer–Moore–Horspool algorithm — In computer science, the Boyer–Moore–Horspool algorithm or Horspool s algorithm is an algorithm for finding substrings in strings. It is a simplification of the Boyer Moore algorithm which is related to the Knuth Morris Pratt algorithm. The… …   Wikipedia

  • Algoritmo de búsqueda de cadenas Boyer-Moore — El algoritmo de búsqueda de cadenas Boyer Moore es un particularmente eficiente algoritmo de búsqueda de cadenas, y ha sido el punto de referencia estándar para la literatura de búsqueda de cadenas práctica.[1] Fue desarrollado por Bob Boyer y J… …   Wikipedia Español

  • Algorithme De Boyer-Moore — L algorithme de Boyer Moore est un algorithme de recherche de sous chaîne particulièrement efficace. Il a été développé par Bob Boyer et J. Strother Moore en 1977. Sommaire 1 Présentation 2 Fonctionnement de l algorithme 2.1 Pré traitement …   Wikipédia en Français

  • Algorithme de boyer-moore — L algorithme de Boyer Moore est un algorithme de recherche de sous chaîne particulièrement efficace. Il a été développé par Bob Boyer et J. Strother Moore en 1977. Sommaire 1 Présentation 2 Fonctionnement de l algorithme 2.1 Pré traitement …   Wikipédia en Français

  • Algorithme de recherche de chaîne de caractères de Boyer-Moore — Algorithme de Boyer Moore L algorithme de Boyer Moore est un algorithme de recherche de sous chaîne particulièrement efficace. Il a été développé par Bob Boyer et J. Strother Moore en 1977. Sommaire 1 Présentation 2 Fonctionnement de l algorithme …   Wikipédia en Français

  • Algorithme de recherche de sous-chaîne de Boyer-Moore — Algorithme de Boyer Moore L algorithme de Boyer Moore est un algorithme de recherche de sous chaîne particulièrement efficace. Il a été développé par Bob Boyer et J. Strother Moore en 1977. Sommaire 1 Présentation 2 Fonctionnement de l algorithme …   Wikipédia en Français

  • Boyer — is a French, English, German and even Turkish last name. Boyer in English comes from bowyer, meaning bow maker or bow seller. In French, it means ox leader , the one leading the ox while plowing. In Turkish, it comes from boy er, boy meaning size …   Wikipedia

  • Moore — Contents 1 People 2 Places 3 Science 4 Programming …   Wikipedia


We are using cookies for the best presentation of our site. Continuing to use this site, you agree with this.