Datalog


Datalog

Datalog is a query and rule language for deductive databases that syntactically is a subset of Prolog. Its origins date back to the beginning of logic programming, but it became prominent as a separate area around 1977 when Hervé Gallaire and Jack Minker organized a workshop on logic and databases.[1] David Maier is credited with coining the term Datalog.[2]

Contents

Features, limitations and extensions

Query evaluation with Datalog is based on first order logic, and is thus sound and complete. It can be done efficiently even for large databases. Query evaluation is usually done using bottom-up strategies.

In contrast to Prolog, it

  1. disallows complex terms as arguments of predicates, e.g. p(1, 2) is admissible but not p(f1(1), 2),
  2. imposes certain stratification restrictions on the use of negation and recursion, and
  3. only allows range-restricted variables, i.e. each variable in the conclusion of a rule must also appear in a not negated clause in the premise of this rule.

These rules make the set of all possible proofs finite, with the consequence that all datalog programs terminate (unlike Prolog programs). As a consequence, statements and predicates of a program can be stated in any order (unlike Prolog). Various methods have been proposed to efficiently perform queries, e.g. the Magic Sets algorithm,[3] or tabled logic programming.[4]

Datalog engines are behind specialised database systems such as Intellidimension's database for the semantic web. Moreover, some widely used database systems include ideas and algorithms developed for Datalog. For example, the SQL:1999 standard includes recursive queries, and the Magic Sets algorithm (initially developed for the faster evaluation of Datalog queries) is implemented in IBM's DB2.

Two extensions that have been made to Datalog include an extension to allow object-oriented programming and an extension to allow disjunctions as heads of clauses. Both extensions have major impacts on the definition of Datalog's semantics and on the implementation of a corresponding Datalog interpreter.

Example

Example Datalog program:

parent(bill,mary).
parent(mary,john).

These two lines define two facts, i.e. things that always hold. They can be intuitively understood as: the parent of mary is bill and the parent of john is mary.

ancestor(X,Y) :- parent(X,Y).
ancestor(X,Y) :- parent(X,Z),ancestor(Z,Y).

These two lines describe the rules that define the ancestor relationship. A rule consists of two main parts separated by the :- symbol. The part to the left of this symbol is the head, the part to the right the body of the rule. A rule is read (and can be intuitively understood) as <head> if it is known that <body>. Uppercase letters stand for variables. Hence in the example the first rule can be read as X is the ancestor of Y if it is known that X is the parent of Y. And the second rule as X is the ancestor of Y if it is known that X is the parent of some Z and Z is the ancestor of Y. The ordering of the clauses is irrelevant in Datalog in contrast to Prolog which depends on the ordering of clauses for computing the result of the query call.

Datalog distinguishes between extensional and intensional predicate symbols. While extensional predicate symbols are only defined by facts, intensional predicate symbols are defined only by rules. In the example above ancestor is an intensional predicate symbol, and parent is extensional. Predicates may also be defined by facts and rules and therefore neither be purely extensional nor intensional, but any datalog program can be rewritten into an equivalent program without such predicate symbols with duplicate roles.

?- ancestor(bill,X).

The query above asks for all ancestors of bill and would return mary and john when posed against a Datalog system containing the facts and rules described above.

Systems implementing Datalog

Most implementations of Datalog stem from university projects. Here is a short list of systems that are either based on Datalog or provide a Datalog interpreter:

Free software/Open source

  • DES, an open-source implementation of Datalog to be used for teaching Datalog in courses. (GNU GPL)
  • bddbddb, an implementation of Datalog done at Stanford University. It is mainly used to query Java bytecode including points-to analysis on large Java programs. (GNU LGPL)
  • IRIS, an open-source Datalog engine implemented in Java. IRIS extends Datalog with function symbols, built-in predicates, locally stratified or un-stratified logic programs (using the well-founded semantics), unsafe rules and XML schema data types. (GNU LGPL v2.1)
  • Oroboro, an open-source RDF processing framework implemented in Java, includes a query and inference language based on Datalog. (GNU GPL v3+)
  • Inter4QL, an open-source command-line interpreter of Datalog-like 4QL query language implemented in C++ for Windows, Mac OS X and Linux. Negation is allowed in heads and bodies of rules as well as in recursion. (GNU GPL v3)
  • Cascalog, a Clojure library for querying data stored on Hadoop clusters (GNU GPL v3)
  • Clojure Datalog, a contributed Clojure library implementing aspects of Datalog. (Eclipse Public License 1.0)
  • XSB, a logic programming and deductive database system for Unix and Windows. (GNU LGPL)
  • ConceptBase, a deductive and object-oriented database system based on a Datalog query evaluator. It is mainly used for conceptual modeling and metamodeling. (FreeBSD-style license)
  • Datalog, a lightweight deductive database system written in Lua. (GNU LGPL)
  • The Jena Semantic Web framework includes a Datalog implementation as part of its general purpose rule engine, which also forms the implementation of OWL and RDFS support [5]

Non-free software

  • .QL, a commercial object-oriented variant of Datalog created by Semmle. (patents pending)
  • SecPAL a security policy language developed by Microsoft Research[6]
  • DLV is a commercial Datalog extension that supports disjunctive head clauses.
  • Datalog for Racket, an implementation of Datalog for the Racket programming language.(unknown licence, possibly free software)
  • OverLog, an implementation of Datalog for overlay networks. (unknown licence)
  • LogicBlox, a commercial implementation of Datalog used for web-based retail planning and insurance applications
  • Meld, a commercial extension to Datalog for programming distributed systems
  • Intellidimension, provides several commercial implementations of datalog engine based on Semantic Web standards
  • StrixDB: a commercial RDF graph store, SPARQL compliant with Lua API and Datalog inference capabilities. Could be used as httpd (Apache HTTP Server) module or standalone. (although beta versions are under the Perl Artistic License 2.0)

See also

References

  1. ^ Hervé Gallaire, Jack Minker (Eds.): Logic and Data Bases, Symposium on Logic and Data Bases, Centre d'études et de recherches de Toulouse, 1977. Advances in Data Base Theory, Plenum Press, New York, 1978, ISBN 0-306-40060-X.
  2. ^ Serge Abiteboul, Richard Hull, Victor Vianu: Foundations of databases. p305 [1].
  3. ^ Bancilhon, "Magic sets and other strange ways to implement logic programs"
  4. ^ Frank Pfenning and Carsten Schuermann Twelf User's Guide
  5. ^ "Jena". http://jena.sourceforge.net/inference/#rules. 
  6. ^ "SecPAL". Microsoft Research. http://research.microsoft.com/projects/secpal. 

Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Datalog — ist eine Datenbank Programmiersprache für deduktive Datenbanken, die Prolog syntaktisch und semantisch ähnelt. Sie geht zurück auf die Arbeit von Herve Gallaire und Jack Minker im Jahr 1978. Sie ist eigentlich nur von theoretischer Bedeutung, da… …   Deutsch Wikipedia

  • Datalog — Apparu en 1977 Auteur Hervé Gallaire et Jack Minker Paradigme Langage de requête Datalog est un langage de requête et de règles pour les bases d …   Wikipédia en Français

  • datalog — da|ta|log sb., en, er, erne …   Dansk ordbog

  • datalog — s ( en, er) …   Clue 9 Svensk Ordbok

  • Final Fantasy XIII — Final Fantasy XIII …   Wikipedia

  • Final Fantasy XIII — Обложка европейского издания Разработчик Square Enix Издатель Square Enix …   Википедия

  • Conjunctive query — In database theory, a conjunctive query is a restricted form of first order queries. A large part of queries issued on relational databases can be written as conjunctive queries, and large parts of other first order queries can be written as… …   Wikipedia

  • Konjunktive Anfrage — Konjunktive Anfragen sind eine Einschränkung von Anfragen der Prädikatenlogik und haben eine Reihe an wünschenswerten Eigenschaften, die in der Datenbanktheorie intensiv untersucht worden sind. Viele Anfragen an relationale Datenbanken und damit… …   Deutsch Wikipedia

  • Complexity of constraint satisfaction — The complexity of constraint satisfaction is the application of computational complexity theory on constraint satisfaction. It has mainly been studied for discriminating between tractable and intractable classes of constraint satisfaction… …   Wikipedia

  • SQL — This article is about the database language. For the airport with IATA code SQL, see San Carlos Airport. SQL Paradigm(s) Multi paradigm Appeared in 1974 Designed by Donald D. Chamberlin Raymond F. Boyce Developer …   Wikipedia