Set (computer science)

Set (computer science)

In computer science, a set is a collection (container) of certain values, without any particular order, and no repeated values. It corresponds with a finite set in mathematics. Disregarding sequence, and the fact that there are no repeated values, it is the same as a list. A set can be seen as an associative array (partial mapping) in which the value of each key-value pair is ignored.


Sets can be implemented using various data structures. Ideal set data structures make it efficient to check if an object is in the set, as well as enabling other useful operations such as iterating through all the objects in the set, performing a union or intersection of two sets, or taking the complement of a set in some limited domain. Any associative array data structure can be used to implement a set by letting the set of keys be the elements of the set and ignoring the values. Because of the similarity to associative arrays, sets are commonly implemented in the same ways, namely, a self-balancing binary search tree for sorted sets (which has O(log n) for most operations), or a hash table for unsorted sets (which has O(1) average-case, but O(n) worst-case, for most operations). A sorted linear hash table [ Citation | title=Sorted Linear Hash Table | first1=Thomas | last1=Wang |url= | year=1997 ] may be used to provide deterministically ordered sets.

Other popular methods include arrays. In particular a subset of the integers 1.."n" can be implemented efficiently as an "n"-bit bit array, which also support very efficient union and intersection operations. A Bloom map implements a set probabilistically, using a very compact representation but risking a small chance of false positives on queries.

However, very few of these data structures support set operations such as union or intersection efficiently. For these operations, more specialized set data structures exist.

Language support

One of the earliest languages to support sets was Pascal; many languages now include it, whether in the core language or in a standard library. The Java programming language offers the Javadoc:SE|java/util|Set interface to support sets (with the Javadoc:SE|java/util|HashSet class implementing it using a hash table), and the Javadoc:SE|java/util|SortedSet sub-interface to support sorted sets (with the Javadoc:SE|java/util|TreeSet class implementing it using a binary search tree). In C++, STL provides the "set" template class, which implements a sorted set using a binary search tree; and SGI's STL provides the "hash_set" class, which implements a set using a hash table. Python has a built-in set type, but no set literal.


A variation of the set is the multiset or bag, which is the same as a set data structures, but allows repeated values. Formally, a multiset can be thought of as an associative array that maps unique elements to positive integers, indicating the mulplicity of the element, although actual implementation may vary. C++'s Standard Template Library provides the "multiset" class for the sorted multiset, and SGI's STL provides the "hash_multiset" class, which implements a set using a hash table. Apache Commons Collections provides [ Bag] and SortedBag interface for Java; along with implementing classes like HashBag and TreeBag, analogous to similarly-named Set implementations.


ee also

*Bloom map
*Disjoint-set data structure
*Standard Template Library - contains a C++ set class

Wikimedia Foundation. 2010.

Look at other dictionaries:

  • computer science — computer scientist. the science that deals with the theory and methods of processing information in digital computers, the design of computer hardware and software, and the applications of computers. [1970 75] * * * Study of computers, their… …   Universalium

  • Computer science — or computing science (abbreviated CS) is the study of the theoretical foundations of information and computation and of practical techniques for their implementation and application in computer systems. Computer scientists invent algorithmic… …   Wikipedia

  • Computer Science House — (CSH) is a special interest house founded in 1976 at the Rochester Institute of Technology, made up of a group of students who share an interest in computers, community and having fun. Despite its name, students from all majors are allowed to… …   Wikipedia

  • computer science — noun the branch of engineering science that studies (with the aid of computers) computable processes and structures • Syn: ↑computing • Topics: ↑computer, ↑computing machine, ↑computing device, ↑data processor, ↑electronic computer, ↑ …   Useful english dictionary

  • COMPUTER SCIENCE — The term Computer Science encompasses three different types of research areas: computability, efficiency, and methodology. General Introduction Computability deals with the question of what is mechanically computable. The most natural way to… …   Encyclopedia of Judaism

  • String (computer science) — In formal languages, which are used in mathematical logic and theoretical computer science, a string is a finite sequence of symbols that are chosen from a set or alphabet. In computer programming, a string is traditionally a sequence of… …   Wikipedia

  • Abstraction (computer science) — In computer science, abstraction is the process by which data and programs are defined with a representation similar to its pictorial meaning as rooted in the more complex realm of human life and language with their higher need of summarization… …   Wikipedia

  • Unification (computer science) — Unification, in computer science and logic, is an algorithmic process by which one attempts to solve the satisfiability problem. The goal of unification is to find a substitution which demonstrates that two seemingly different terms are in fact… …   Wikipedia

  • Recursion (computer science) — Recursion in computer science is a way of thinking about and solving problems. It is, in fact, one of the central ideas of computer science. [cite book last = Epp first = Susanna title = Discrete Mathematics with Applications year=1995… …   Wikipedia

  • AP Computer Science — Advanced Placement Computer Science (also called APCS) is the name of two distinct Advanced Placement courses and examinations offered by the College Board to high school students as an opportunity to earn college credit for a college level… …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”

We are using cookies for the best presentation of our site. Continuing to use this site, you agree with this.