Parse tree


Parse tree

A concrete syntax tree or parse tree or parsing tree[1] is an ordered, rooted tree that represents the syntactic structure of a string according to some formal grammar. In a parse tree, the interior nodes are labeled by non-terminals of the grammar, while the leaf nodes are labeled by terminals of the grammar. Parse trees may be generated for sentences in natural languages (see natural language processing), as well as during processing of computer languages, such as programming languages. Parse trees are distinct from abstract syntax trees (also known simply as syntax trees), in that their structure and elements more concretely reflect the syntax of the input language.

If a formal grammar is ambiguous, there can be multiple parse trees for some given string (Syntactic ambiguity).

Contents

Basic description

A parse tree is made up of nodes and branches. The image below represents a linguistic parse tree, here representing the English sentence "John hit the ball". (The parse tree here is a greatly simplified one; for more information, see X-bar theory.) The parse tree is the entire structure, starting from S and ending in each of the leaf nodes (John, hit, the, ball). We use the following abbreviations in the example:

A simple parse tree
Hybrid constituency/dependency tree from the Quranic Arabic Corpus

In a parse tree, each node is either a root node, a branch node, or a leaf node. In the example to the right, S is a root node, NP and VP are branch nodes, while John, hit, the, and ball are all leaf nodes. The leaves are the lexical tokens of the sentence.[2]

A node can also be referred to as parent node or a child node. A parent node is one that has at least one other node linked by a branch under it. In the example, S is a parent of both NP and VP. A child node is one that has at least one node directly above it to which it is linked by a branch of the tree. Again from our example, hit is a child node of V. The terms mother and daughter are also sometimes used for this relationship.

See also

References

  1. ^ Chiswell, Ian; Hodges, Wilfrid (2007). Mathematical Logic. Oxford: Oxford University Press. p. 34. ISBN 9780199215621. 
  2. ^ Aho, Alfred V. et al. Compilers: Principles, Techniques, & Tools. Boston: Pearson/Addison Wesley, 2007. Print.

External links


Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Tree diagram — The term tree diagram is used in different ways in different disciplines. * In mathematics and statistical methods, a tree diagram is used to determine the probability of getting specific results where the possibilities are nested.… …   Wikipedia

  • XL (programming language) — XL stands for eXtensible Language. It is a computer programming language designed to support concept programming.XL features programmer reconfigurable syntax and semantics. Compiler plug ins can be used to add new features to the language. A base …   Wikipedia

  • Non-terminal function — A non terminal function is a function (node) in a parse tree which is either a root or a branch in that tree. Background A parse tree is made up of nodes and branches. In the picture below the parse tree is the entire structure, starting from S… …   Wikipedia

  • Terminal function — A terminal function is a function (node) in a parse tree which is a leaf in that parse tree. Background A parse tree is made up of nodes and branches. In the picture below the parse tree is the entire structure, starting from S and ending in each …   Wikipedia

  • Context-free grammar — In formal language theory, a context free grammar (CFG) is a formal grammar in which every production rule is of the form V → w where V is a single nonterminal symbol, and w is a string of terminals and/or nonterminals (w can be empty). The… …   Wikipedia

  • Memoization — Not to be confused with Memorization. In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs.… …   Wikipedia

  • LL parser — An LL parser is a top down parser for a subset of the context free grammars. It parses the input from Left to right, and constructs a Leftmost derivation of the sentence (hence LL, compared with LR parser). The class of grammars which are… …   Wikipedia

  • CYK algorithm — The Cocke–Younger–Kasami (CYK) algorithm (alternatively called CKY) is a parsing algorithm for context free grammars. It employs bottom up parsing and dynamic programming. The standard version of CYK operates only on context free grammars given… …   Wikipedia

  • Bottom-up parsing — (also known as shift reduce parsing) is a strategy for analyzing unknown data relationships that attempts to identify the most fundamental units first, and then to infer higher order structures from them. It attempts to build trees upward toward… …   Wikipedia

  • Parser Grammar Engine — The Parser Grammar Engine (originally Parrot Grammar Engine) or PGE is a compiler and runtime for a Perl 6 rules for the Parrot virtual machine. [cite web | url=http://search.cpan.org/ ltoetsch/parrot 0.2.2/compilers/pge/README | title=Parrot… …   Wikipedia