Tree (data structure)


A simple unordered tree; in this diagram, the node labeled 7 has two children, labeled 2 and 6, and one parent, labeled 2. The root node, at the top, has no parent.

In computer science, a tree is a widely used data structure that emulates a hierarchical tree structure with a set of linked nodes.

Mathematically, it is an ordered directed tree, more specifically an arborescence: an acyclic connected graph where each node has zero or more child nodes and at most one parent node. Furthermore, the children of each node have a specific order.

Terminology

A node is a structure which may contain a value, a condition, or represent a separate data structure (which could be a tree of its own). Each node in a tree has zero or more child nodes, which are below it in the tree (by convention, trees are drawn growing downwards). A node that has a child is called the child's parent node (or ancestor node, or superior). A node has at most one parent.

An internal node or inner node is any node of a tree that has child nodes. Similarly, an external node (also known as an outer node, leaf node, or terminal node), is any node that does not have child nodes.
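The terminology above can be illustrated with a minimal sketch. The class name `Node` and its methods `add_child`, `is_leaf`, and `is_internal` are hypothetical names chosen for this example, not part of any standard library.

```python
class Node:
    """A tree node with a value, at most one parent, and ordered children."""

    def __init__(self, value):
        self.value = value
        self.parent = None      # a node has at most one parent
        self.children = []      # zero or more child nodes, in order

    def add_child(self, child):
        """Attach child as the last child of this node and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def is_leaf(self):
        """External (leaf, terminal) node: no child nodes."""
        return not self.children

    def is_internal(self):
        """Internal (inner) node: at least one child node."""
        return bool(self.children)


root = Node("root")
a = root.add_child(Node("a"))
b = root.add_child(Node("b"))
c = a.add_child(Node("c"))
```

In this small tree, `root` and `a` are internal nodes, while `b` and `c` are leaf nodes.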

The topmost node in a tree is called the root node. Being the topmost node, the root node has no parent. It is the node at which operations on the tree commonly begin (although some algorithms begin with the leaf nodes and work up, ending at the root). All other nodes can be reached from it by following edges or links. (In the formal definition, each such path is also unique.) In diagrams, it is typically drawn at the top. In some trees, such as heaps, the root node has special properties. Every node in a tree can be seen as the root node of the subtree rooted at that node. A free tree is a tree that is not rooted.

The height of a node is the length of the longest downward path to a leaf from that node. The height of the root is the height of the tree. The depth of a node is the length of the path to its root (i.e., its root path). Node heights are commonly needed when manipulating the various self-balancing trees, AVL trees in particular. Conventionally, a height of −1 corresponds to a subtree with no nodes, whereas a height of zero corresponds to a subtree with one node.
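As a rough sketch of these two definitions, the functions below compute height and depth, counting path length in edges. The `Node` class and the function names `height` and `depth` are assumptions made for this example.

```python
class Node:
    """Minimal node: a value, a parent link, and a list of children."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def height(node):
    """Longest downward path (in edges) from node to a leaf.

    By the convention described above, an empty subtree (None) has
    height -1, so a single node has height 0.
    """
    if node is None:
        return -1
    return 1 + max((height(child) for child in node.children), default=-1)


def depth(node):
    """Length (in edges) of the path from node up to the root."""
    steps = 0
    while node.parent is not None:
        node = node.parent
        steps += 1
    return steps


root = Node("root")
a = Node("a", parent=root)
b = Node("b", parent=root)
c = Node("c", parent=a)
```

Here `height(root)` is 2 (the path root, a, c) and `depth(c)` is also 2, while the leaf `b` has height 0 and depth 1.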

A subtree of a tree T is a tree consisting of a node in T and all of its descendants in T. (This is different from the formal definition of subtree used in graph theory.[1]) The subtree corresponding to the root node is the entire tree; the subtree corresponding to any other node is called a proper subtree (in analogy to the term proper subset).

Representations

There are many different ways to represent trees. Common representations store the nodes as dynamically allocated records with pointers to their children, their parents, or both, or as items in an array, with the relationships between them determined by their positions in the array (e.g., a binary heap).
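The array representation can be sketched as follows, assuming the usual 0-based layout of a binary heap in which the children of the item at index i sit at indices 2i + 1 and 2i + 2. The helper names `parent_index` and `child_indices` are hypothetical.

```python
def parent_index(i):
    """Index of the parent of item i, or None for the root (index 0)."""
    return (i - 1) // 2 if i > 0 else None


def child_indices(i, size):
    """Indices of the children of item i that exist in an array of `size` items."""
    return [j for j in (2 * i + 1, 2 * i + 2) if j < size]


# A small binary min-heap stored level by level in a flat list.
heap = [1, 3, 2, 7, 4]

# Children of the root (index 0) are at indices 1 and 2.
root_children = [heap[j] for j in child_indices(0, len(heap))]
```

No pointers are stored at all: the parent and child relationships are implied entirely by the positions of the items.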

Trees and graphs

The tree data structure can be generalized to represent directed graphs by removing the constraints that a node may have at most one parent and that no cycles are allowed. Edges are still abstractly considered as pairs of nodes; however, the terms parent and child are usually replaced by different terminology (for example, source and target). Different implementation strategies exist, for example adjacency lists.
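A minimal sketch of the adjacency-list strategy, assuming edges are given as (source, target) pairs. The function names `build_adjacency` and `sources_of` are made up for this illustration.

```python
def build_adjacency(edges):
    """Map each node to the list of targets of its outgoing edges."""
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, []).append(target)
        adjacency.setdefault(target, [])  # make sure every node has an entry
    return adjacency


def sources_of(adjacency, node):
    """All nodes with an edge into `node` (the analogue of 'parents')."""
    return [s for s, targets in adjacency.items() if node in targets]


# "c" has two incoming edges, and a -> c -> a forms a cycle:
# both are forbidden in a tree but allowed in a directed graph.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
graph = build_adjacency(edges)
```

With this input, `sources_of(graph, "c")` yields both "a" and "b", which is exactly the situation the single-parent constraint of a tree rules out.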

Relationship with trees in graph theory

In graph theory, a tree is a connected acyclic graph; unless stated otherwise, trees and graphs are undirected. There is no one-to-one correspondence between such trees and trees as data structures. We can take an arbitrary undirected tree, arbitrarily pick one of its vertices as the root, make all its edges directed by making them point away from the root node (producing an arborescence), and assign an order to all the nodes. The result corresponds to a tree data structure. Picking a different root or a different ordering produces a different one.
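The construction described above can be sketched with a breadth-first search. The input format (a dictionary of neighbor lists) and the function name `root_tree` are assumptions for this example; the order of each neighbor list is taken as the order of the children.

```python
from collections import deque


def root_tree(neighbors, root):
    """Orient an undirected tree away from `root`.

    `neighbors` maps each vertex to a list of adjacent vertices.
    Returns a dict mapping each vertex to its ordered list of children.
    """
    children = {root: []}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for other in neighbors[node]:
            if other not in children:      # not seen yet, so it is a child
                children[other] = []
                children[node].append(other)
                queue.append(other)
    return children


# Undirected tree with edges a-b, a-c, b-d.
neighbors = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a"], "d": ["b"]}

rooted_at_a = root_tree(neighbors, "a")
rooted_at_d = root_tree(neighbors, "d")
```

The same undirected tree gives two different rooted trees: rooted at "a", the vertex "a" has children "b" and "c"; rooted at "d", the vertex "a" becomes a grandchild of the root.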

Traversal methods

Stepping through the items of a tree, by means of the connections between parents and children, is called walking the tree, and the action is a walk of the tree. Often, an operation might be performed when a pointer arrives at a particular node. A walk in which each parent node is traversed before its children is called a pre-order walk; a walk in which the children are traversed before their respective parents is called a post-order walk; a walk in which a node's left subtree, then the node itself, and finally its right subtree are traversed is called an in-order walk. (This last scenario, referring to exactly two subtrees, a left subtree and a right subtree, assumes specifically a binary tree.) For an expression tree, an in-order walk corresponds to infix notation, a post-order walk to postfix notation, and a pre-order walk to prefix notation.
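The three walks can be sketched on a binary expression tree for (1 + 2) * 3. The class name `BinaryNode` and the function names are hypothetical, and each walk simply collects the visited values into a list.

```python
class BinaryNode:
    """Binary tree node with optional left and right subtrees."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    """Visit the node first, then its left and right subtrees."""
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    """Visit both subtrees first, then the node itself."""
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


# Expression tree for (1 + 2) * 3
tree = BinaryNode("*", BinaryNode("+", BinaryNode("1"), BinaryNode("2")), BinaryNode("3"))

prefix = " ".join(pre_order(tree))    # "* + 1 2 3"
infix = " ".join(in_order(tree))      # "1 + 2 * 3"
postfix = " ".join(post_order(tree))  # "1 2 + 3 *"
```

Note that the in-order output drops the parentheses of the original expression, so it matches infix notation only up to grouping.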

Common operations

  • Enumerating all the items
  • Enumerating a section of a tree
  • Searching for an item
  • Adding a new item at a certain position on the tree
  • Deleting an item
  • Removing a whole section of a tree (called pruning)
  • Adding a whole section to a tree (called grafting)
  • Finding the root for any node
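Several of the operations listed above can be sketched on a simple pointer-based tree. All names here (`Node`, `graft`, `prune`, `find_root`, `enumerate_items`, `search`) are hypothetical and chosen for this example; enumeration is done in pre-order, which is only one of several possible orders.

```python
class Node:
    """Minimal node: a value, a parent link, and ordered children."""

    def __init__(self, value):
        self.value = value
        self.parent = None
        self.children = []


def graft(parent, subtree_root):
    """Add a whole section to the tree as the last child of `parent`."""
    subtree_root.parent = parent
    parent.children.append(subtree_root)


def prune(node):
    """Detach the section rooted at `node` from its tree and return it."""
    if node.parent is not None:
        node.parent.children.remove(node)
        node.parent = None
    return node


def find_root(node):
    """Follow parent links upward until the root is reached."""
    while node.parent is not None:
        node = node.parent
    return node


def enumerate_items(node):
    """Yield every value in the section rooted at `node`, parents first."""
    yield node.value
    for child in node.children:
        yield from enumerate_items(child)


def search(node, target):
    """Return the first node holding `target`, or None if it is absent."""
    if node.value == target:
        return node
    for child in node.children:
        found = search(child, target)
        if found is not None:
            return found
    return None


root, a, b, c = Node("r"), Node("a"), Node("b"), Node("c")
graft(root, a)
graft(root, b)
graft(a, c)
before = list(enumerate_items(root))   # ["r", "a", "c", "b"]

section = prune(a)                     # removes "a" together with "c"
after = list(enumerate_items(root))    # ["r", "b"]
```

After pruning, the detached section is a tree of its own, so `find_root(c)` returns the node `a` rather than the original root.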

References

  1. Eric W. Weisstein, "Subtree." From MathWorld--A Wolfram Web Resource. http://mathworld.wolfram.com/Subtree.html

Wikimedia Foundation. 2010.
