Higher-order abstract syntax

Higher-order abstract syntax

In computer science, higher-order abstract syntax (abbreviated HOAS) is a technique for the representation of abstract syntax trees for languages with variable binders.

Relation to first-order abstract syntax

An abstract syntax tree is "abstract" because it is a mathematical object that has certain structure by its very nature. For instance, in "first-order abstract syntax" ("FOAS") trees, as commonly used in compilers, the tree structure implies the subexpression relation, meaning that no parentheses are required to disambiguate programs (as they are in the concrete syntax). HOAS exposes additional structure: the relationship between variables and their binding sites. In FOAS representations, a variable is typically represented with an identifier, with the relation between binding site and use being indicated by using the "same" identifier. With HOAS, there is no name for the variable; each use of the variable refers directly to the binding site.

There are a number of reasons why this technique is useful. First, it makes the binding structure of a program explicit; just as there is no need to explain operator precedence in a FOAS representation, there is no need to have the rules of binding and scope at hand in order to interpret a HOAS representation. Second, programs that are
alpha-equivalent (differing only in the names of bound variables) have identical representations in HOAS, which can make equivalence checking more efficient.


One mathematical object that could be used to implement HOAS is a graph where variables are associated with their binding sites via edges. Another popular way to implement HOAS (in, for example, compilers) is with de Bruijn indices.

Use in logical frameworks

In the domain of logical frameworks, the term higher-order abstract syntax is usually used to refer to a specific representation that uses the binders of the meta-language in order to encode the binding structure of the object language.

For instance, the logical framework LF has a λ-construct, which has arrow(→) type. A first-order encoding of an object language construct let would be (using Twelfsyntax):

exp : type. var : type. v : var -> exp. let : exp -> var -> exp -> exp.

Here, exp is the family of object language expressions. The family var is the representation of variables (implemented perhaps as natural numbers, which is not shown); the constant v witnesses the fact that variables are expressions. The constant let is an expression that takes three arguments: an expression (that is being bound), a variable (that it is bound to) and another expression (that the variable is bound within).

The canonical HOAS representation of the same object language would be:

exp : type. let : exp -> (exp -> exp) -> exp.

In this representation, object level variables do not appear explicitly. The constant let takes an expression (that is being bound) and a meta-level function expexp (the body of the let). This function is the "higher-order" part: an expression with a free variable isrepresented as an expression with "holes" that are filled in by the meta-level function when applied. As a concrete example, we would construct the object level expression

let x = 1 + 2 in x + 3

(assuming the natural constructors for numbers and addition) using the HOAS signature above as

let (plus 1 2) ( [y] plus y 3)

where [y] e is Twelf's syntax for the function lambda y.e.

This specific representation has advantages beyond the ones above: for one, by reusing the meta-level notion of binding, the encoding enjoys properties such as type-preserving "substitution" without the need to define/prove them. In this way using HOAS can drastically reduce the amount of boilerplate code having to do with binding in an encoding.

Because this technique reuses the mechanism of the meta-language to encode a concept in the object language, it is generally only applicable when the meta-language and object-language notions of binding coincide. This is often the case, but not always: for instance, it is unlikely that a HOAS encoding of dynamic scope such as in Lisp would be possible in a statically-scopedlanguage like LF.


* cite conference
author = Dale Miller and Gopalan Nadathur
year = 1987
title = A Logic Programming Approach to Manipulating Formulas and Programs
url = http://www.lix.polytechnique.fr/Labo/Dale.Miller/papers/slp87.pdf
booktitle = IEEE Symposium on Logic Programming
pages = 379-388

* cite conference
author = Frank Pfenning, Conal Elliott
year = 1988
title = Higher-order abstract syntax
url = http://www.cs.cmu.edu/~fp/papers/pldi88.pdf
booktitle = Proceedings of the ACM SIGPLAN PLDI '88
pages = 199–208
doi = 10.1145/53990.54010
id = ISBN 0-89791-269-1

* cite journal
author = J. Despeyroux, A. Felty, A. Hirschowitz
year = 1995
title = Higher-Order Abstract Syntax in Coq
url = http://www.site.uottawa.ca/~afelty/dist/tlca95.ps
journal = Lecture Notes in Computer Science
volume = 902
pages = 124–138
id = ISBN 3-540-59048-X

* cite conference
author = Martin Hofmann
year = 1999
title = Semantical analysis of higher-order abstract syntax
url = http://www.tcs.informatik.uni-muenchen.de/~mhofmann/lics99hoas.ps.gz
booktitle = 14th Annual IEEE Symposium on Logic in Computer Science
pages = 204
id = ISBN 0-7695-0158-3

* cite conference
author = Dale Miller
year = 2000
title = Abstract Syntax for Variable Binders: An Overview
url = http://www.lix.polytechnique.fr/Labo/Dale.Miller/papers/ltrees.pdf
booktitle = Computational Logic - {CL} 2000
pages = 239-253

* cite conference
author = Eli Barzilay, Stuart Allen
title = Reflecting Higher-Order Abstract Syntax in Nuprl
url = http://www.barzilay.org/misc/hoas-paper.pdf
booktitle = Theorem Proving in Higher-Order Logics 2002
year = 2002
pages = 23–32
id = ISBN 3-540-44039-9

* cite conference
author = Eli Barzilay
year = 2006
title = A Self-Hosting Evaluator using HOAS
url = http://scheme2006.cs.uchicago.edu/15-barzilay.pdf
booktitle = ICFP Workshop on Scheme and Functional Programming 2006

Wikimedia Foundation. 2010.

Look at other dictionaries:

  • Abstract syntax — The abstract syntax of data is its structure described as a data type (possibly, but not necessarily, an abstract data type), independent of any particular representation or encoding. To be implemented either for computation or communications, a… …   Wikipedia

  • Higher order grammar — (HOG) is a grammar theory based on higher order logic. It can be viewed simultaneously as generative enumerative (like Categorial Grammar and Principles Parameters) or model theoretic (like Head Driven Phrase Structure Grammar or Lexical… …   Wikipedia

  • First-order logic — is a formal logical system used in mathematics, philosophy, linguistics, and computer science. It goes by many names, including: first order predicate calculus, the lower predicate calculus, quantification theory, and predicate logic (a less… …   Wikipedia

  • Second-order logic — In logic and mathematics second order logic is an extension of first order logic, which itself is an extension of propositional logic.[1] Second order logic is in turn extended by higher order logic and type theory. First order logic uses only… …   Wikipedia

  • Musical syntax — When analysing the regularities and structure of music as well as the processing of music in the brain, certain findings lead to the question, if music is based on a syntax which could be compared with linguistic syntax. To get closer to this… …   Wikipedia

  • Vietnamese syntax — Vietnamese, like many languages in Southeast Asia, is an analytic (or isolating) language. [Comparison note: As such its grammar relies on word order and sentence structure rather than morphology (in which word changes through inflection).… …   Wikipedia

  • Nominal terms (computer science) — Nominal terms are a metalanguage for embedding object languages with binding constructs into. Intuitively, they may be seen as an extension of first order terms with support for name binding. Consequently, the native notion of equality between… …   Wikipedia

  • ΛProlog — λProlog, also written lambda Prolog, is a logic programming language featuring polymorphic typing, modular programming, and higher order programming. These extensions to Prolog are derived from the higher order hereditary Harrop formulas used to… …   Wikipedia

  • TreeDL — Tree Description Language (TreeDL) is a computer language for description of strictly typed tree data structures and operations on them. The main use of TreeDL is in the development of language oriented tools (compilers, translators, etc) for the …   Wikipedia

  • Nominal techniques — are a range of techniques, based on nominal sets, for handling names and binding, e.g. in abstract syntax. Research into nominal sets gave rise to nominal terms, a metalanguage for embedding object languages with name binding constructs into. See …   Wikipedia