String operations

String operations

In computer science, in the area of formal language theory, frequent use is made of a variety of string functions; however, the notation used is different from that used on computer programming, and some commonly used functions in the theoretical realm are rarely used when programming. This article defines some of these basic terms.

Alphabet of a string

The alphabet of a string is a list of all of the letters that occur in a particular string. If "s" is a string, its alphabet is denoted by

:operatorname{Alph}(s)

tring substitution

Let "L" be a language, and let Sigma be its alphabet. A string substitution or simply a substitution is a mapping "f" that maps letters in Sigma to languages (possibly in a different alphabet). Thus, for example, given a letter ain Sigma, one has f(a)=L_a where L_asubsetDelta^* is some language whose alphabet is Delta. This mapping may be extended to strings as

:f(varepsilon)=varepsilon

for the empty string varepsilon, and

:f(sa)=f(s)f(a)

for string sin L. String substitution may be extended to the entire language as

:f(L)=igcup_{sin L} f(s)

An example of string substitution occurs in regular languages, which are closed under string substitution. That is, if the letters of a regular language are substituted by other regular languages, the result is still a regular language.

tring homomorphism

A string homomorphism is a string substitution such that each letter is replaced by a single string. That is, f(a)=s, where "s" is a string, for each letter "a". String homomorphisms are homomorphisms, preserving the binary operation of string concatenation. Given a language "L", the set f(L) is called the homomorphic image of "L". The inverse homomorphic image of a string "s" is defined as

:f^{-1}(s)={wvert f(w)=s}

while the inverse homomorphic image of a language "L" is defined as

:f^{-1}(L)={svert f(s)in L}

Note that, in general, f(f^{-1}(L)) e L, while one does have

:f(f^{-1}(L)) subseteq L

and

:L subseteq f^{-1}(f(L))

for any language "L". Simple single-letter substitution ciphers are examples of string homomorphisms.

tring projection

If "s" is a string, and Sigma is an alphabet, the string projection of "s" is the string that results by removing all letters which are not in Sigma. It is written as pi_Sigma(s),. It is formally defined by removal of letters from the right hand side:

:pi_Sigma(s) = egin{cases} varepsilon & mbox{if } s=varepsilon mbox{ the empty string} \pi_Sigma(t) & mbox{if } s=ta mbox{ and } a otin Sigma \ pi_Sigma(t)a & mbox{if } s=ta mbox{ and } a in Sigma end{cases}

Here varepsilon denotes the empty string. The projection of a string is essentially the same as a projection in relational algebra.

String projection may be promoted to the projection of a language. Given a formal language "L", its projection is given by

:pi_Sigma (L)={pi_Sigma(s) vert sin L }

Right quotient

The right quotient of a letter "a" from a string "s" is the truncation of the letter "a" in the string "s", from the right hand side. It is denoted as s/a. If the string does not have "a" on the right hand side, the result is the empty string. Thus:

:(sa)/ b = egin{cases} s & mbox{if } a=b \varepsilon & mbox{if } a e bend{cases}

The quotient of the empty string may be taken:

:varepsilon / a = varepsilon

Similarly, given a subset Ssubset M of a monoid M, one may define the quotient subset as

:S/a={sin M vert sain S}

Left quotients may be defined similarly, with operations taking place on the left of a string.

yntactic relation

The right quotient of a subset Ssubset M of a monoid M defines an equivalence relation, called the right syntactic relation of "S". It is given by

:sim_S ;,=, {(s,t)in M imes M vert S/s = S/t }

The relation is clearly of finite index (has a finite number of equivalence classes) if and only if the family right quotients is finite; that is, if

:{S/m vert min M}

is finite. In this case, "S" is a recognizable language, that is, a language that can be recognized by a finite state automaton. This is discussed in greater detail in the article on syntactic monoids.

Right cancellation

The right cancellation of a letter "a" from a string "s" is the removal of the first occurrence of the letter "a" in the string "s", starting from the right hand side. It is denoted as sdiv a and is recursively defined as

:(sa)div b = egin{cases} s & mbox{if } a=b \(sdiv b)a & mbox{if } a e bend{cases}

The empty string is always cancellable:

:varepsilon div a = varepsilon

Clearly, right cancellation and projection commute:

:pi_Sigma(s)div a = pi_Sigma(s div a )

Prefixes

The prefixes of a string is the set of all prefixes to a string, with respect to a given language:

:operatorname{Pref}_L(s) = {t vert s=tu mbox { for } uin L}

The prefix closure of a language is

:operatorname{Pref} (L) = igcup_{sin L} operatorname{Pref}_L(s)

A language is called prefix closed if operatorname{Pref} (L) = L. Clearly, the prefix closure operator is idempotent:

:operatorname{Pref} (operatorname{Pref} (L)) =operatorname{Pref} (L)

The prefix relation is a binary relation sqsubseteq such that ssqsubseteq t if and only if s in operatorname{Pref}_L(t).

Prefix grammars generate languages that are prefix-closed.

ee also

* String functions (programming)
* Levi lemma

References

* John E. Hopcroft and Jeffrey D. Ullman, "Introduction to Automata Theory, Languages and Computation", Addison-Wesley Publishing, Reading Massachusetts, 1979. ISBN 0-201-029880-X. "(See chapter 3.)"


Wikimedia Foundation. 2010.

Игры ⚽ Поможем написать реферат

Look at other dictionaries:

  • String (computer science) — In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet. In computer programming, a string is traditionally a sequence of… …   Wikipedia

  • String Quartet in Four Parts — is a string quartet by John Cage, composed in 1950. It was one of the last works Cage wrote that were not entirely aleatoric. Like Sonatas and Interludes for prepared piano (1946 8) and the ballet The Seasons (1947), this work explores ideas from …   Wikipedia

  • Comparison of programming languages (string functions) — String functions redirects here. For string functions in formal language theory, see String operations. Programming language comparisons General comparison Basic syntax Basic instructions Arrays …   Wikipedia

  • String (C++) — In the C++ programming language, the std::string class is a standard representation for a string of text. This class removes many of the problems introduced by C style strings by putting the onus of memory ownership on the string class rather… …   Wikipedia

  • Empty string — In computer science and formal language theory, the empty string (or null string)[1] is the unique string of length zero. It is denoted with λ or sometimes Λ or ε. The empty string is distinct from a null reference in that in an object oriented… …   Wikipedia

  • Approximate string matching — In computing, approximate string matching is the technique of finding approximate matches to a pattern in a string. The closeness of a match is measured in terms of the number of primitive operations necessary to convert the string into an exact… …   Wikipedia

  • Fuzzy string searching — Approximate string search is the name that is used for a category of techniques for finding strings that approximately match some given pattern string. It may also be known as approximate or inexact matching. Approximate string searching has two… …   Wikipedia

  • System Center Operations Manager Command Shell — Windows PowerShell Screenshot von PowerShell 1.0 Basisdaten Entwickler: Microsoft Corporation Aktuelle Version …   Deutsch Wikipedia

  • Rabin-Karp string search algorithm — The Rabin Karp algorithm is a string searching algorithm created by Michael O. Rabin and Richard M. Karp in 1987 that uses hashing to find a substring in a text. It is used for multiple pattern matching rather than single pattern matching. For… …   Wikipedia

  • Karplus-Strong string synthesis — is a method of physical modelling synthesis that loops a short waveform through a filtered delay line to simulate the sound of a hammered or plucked string or some types of percussion.Although it is useful to view this as a subtractive synthesis… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”