Semantics encoding


A semantics encoding is a "translation" between formal languages.

For programmers, the most familiar form of encoding is the compilation of a programming language into machine code or byte-code. Conversion between document formats is also a form of encoding, as is the compilation of TeX or LaTeX documents to PostScript. Some high-level preprocessors, such as Objective Caml's Camlp4 or Apple Computer's WorldScript, also encode one programming language into another.

Definition

Formally, an encoding of a language A into language B is a mapping of all terms of A into B.

If there is a 'satisfactory' encoding of A into B, B is considered 'at least as powerful' (or 'at least as expressive') as A.
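As an illustration of the definition, the following minimal sketch encodes a toy language of arithmetic terms (language A) into programs for a stack machine (language B). Both languages, and the names `encode`, `run` and `evaluate`, are assumptions made up for this example and are not taken from any particular compiler.

```python
# Hypothetical language A: arithmetic terms as nested tuples,
# e.g. ("add", 1, ("mul", 2, 3)).
# Hypothetical language B: stack-machine programs, as lists of instructions.

def encode(term):
    """Map a term of A to a term of B (a stack-machine program)."""
    if isinstance(term, int):
        return [("push", term)]
    op, left, right = term
    # Operands are encoded first, then the operator that consumes them.
    return encode(left) + encode(right) + [(op,)]

def run(program):
    """Execute a stack-machine program and return the final value."""
    stack = []
    for instr in program:
        if instr[0] == "push":
            stack.append(instr[1])
        else:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right if instr[0] == "add" else left * right)
    return stack.pop()

def evaluate(term):
    """Reference semantics of language A."""
    if isinstance(term, int):
        return term
    op, left, right = term
    l, r = evaluate(left), evaluate(right)
    return l + r if op == "add" else l * r

example = ("add", 1, ("mul", 2, 3))
print(encode(example))
print(run(encode(example)), evaluate(example))  # prints: 7 7
```

Here `encode` is defined on every term of A, which is all the definition requires. Whether it is a 'satisfactory' encoding depends on which of the properties discussed in this article it preserves.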

Properties of encodings

This informal notion of translation is not sufficient to help determine expressivity of languages, as it permits trivial encodings such as mapping all elements of A to the same element of B. Therefore, it is necessary to determine the definition of a "good enough" encoding. This notion varies with the application.

Commonly, an encoding $[\cdot] : A \longrightarrow B$ is expected to preserve a number of properties.

Preservation of compositions

; soundness : For every n-ary operator $op_A$ of A, there exists an n-ary operator $op_B$ of B such that $\forall T_A^1, T_A^2, \dots, T_A^n,\ [op_A(T_A^1, T_A^2, \cdots, T_A^n)] = op_B([T_A^1], [T_A^2], \cdots, [T_A^n])$.

; completeness : For every n-ary operator $op_A$ of A, there exists an n-ary operator $op_B$ of B such that $\forall T_B^1, T_B^2, \dots, T_B^n,\ \exists T_A^1, \dots, T_A^n,\ op_B(T_B^1, \cdots, T_B^n) = [op_A(T_A^1, T_A^2, \cdots, T_A^n)]$.

(Note: as far as the author is aware, this criterion of completeness is never used.)

Preservation of compositions is useful insofar as it guarantees that components can be examined either separately or together without "breaking" any interesting property. In particular, in the case of compilations, this soundness guarantees the possibility of proceeding with separate compilation of components, while completeness guarantees the possibility of de-compilation.
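The soundness equation for compositions can be tested on sample terms. The sketch below assumes a hypothetical language A of boolean terms, a hypothetical language B of arithmetic terms over 0 and 1, and a table `OP_B` giving the candidate operator of B for each operator of A. It only checks the equation against that one candidate, so a negative result does not prove that no suitable operator exists.

```python
from itertools import product

# Hypothetical language A: boolean terms such as ("and", True, ("not", False)).
# Hypothetical language B: arithmetic terms over 0 and 1 such as ("mul", 1, 0).

# Candidate operator op_B of B chosen for each operator op_A of A.
OP_B = {
    "and": lambda x, y: ("mul", x, y),
    "not": lambda x: ("sub", 1, x),
}

def encode(term):
    """Encoding defined by structural recursion on the term."""
    if isinstance(term, bool):
        return 1 if term else 0
    op, *args = term
    return OP_B[op](*(encode(a) for a in args))

def encode_folding(term):
    """Variant that inspects the whole term: not(not(T)) is encoded as T."""
    if isinstance(term, tuple) and term[0] == "not":
        inner = term[1]
        if isinstance(inner, tuple) and inner[0] == "not":
            return encode_folding(inner[1])
    if isinstance(term, bool):
        return 1 if term else 0
    op, *args = term
    return OP_B[op](*(encode_folding(a) for a in args))

def compositional_on(enc, op, arity, samples):
    """Check [op_A(T1..Tn)] == op_B([T1]..[Tn]) for all sample arguments."""
    return all(
        enc((op, *args)) == OP_B[op](*(enc(a) for a in args))
        for args in product(samples, repeat=arity)
    )

samples = [True, False, ("not", True), ("and", True, False)]
print(compositional_on(encode, "and", 2, samples))
print(compositional_on(encode, "not", 1, samples))
print(compositional_on(encode_folding, "not", 1, samples))
```

The recursive `encode` satisfies the equation by construction, which is what makes separate compilation possible: each subterm can be encoded on its own and the results combined with `OP_B`. The `encode_folding` variant fails the check for `not` with this choice of `OP_B`, because its output depends on more than the encodings of the subterms.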

Preservation of reductions

This assumes the existence of a notion of reduction on both language A and language B. Typically, in the case of a programming language, reduction is the relation which models the execution of a program.

We write $\longrightarrow$ for one step of reduction and $\longrightarrow^*$ for any number of steps of reduction.

; soundness : For all terms $T_A^1, T_A^2$ of language A, if $T_A^1 \longrightarrow^* T_A^2$ then $[T_A^1] \longrightarrow^* [T_A^2]$.

; completeness : For every term $T_A^1$ of language A and every term $T_B^2$ of language B, if $[T_A^1] \longrightarrow^* T_B^2$ then there exists some $T_A^2$ such that $T_B^2 = [T_A^2]$.

This preservation guarantees that both languages behave the same way. Soundness guarantees that all possible behaviours are preserved, while completeness guarantees that no behaviour is added by the encoding. In particular, in the case of compilation of a programming language, soundness and completeness together mean that the compiled program behaves according to the high-level semantics of the programming language.
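For small finite examples, both conditions can be checked by exhaustively exploring reductions. The sketch below assumes a hypothetical language A of boolean terms and a hypothetical language B of arithmetic terms, each with its own one-step reduction (`step_a`, `step_b`). The completeness check is slightly stronger than the definition, since it looks for the witness only among the terms reachable from the starting term.

```python
def encode(term):
    """Hypothetical encoding of boolean terms (A) into arithmetic terms (B)."""
    if isinstance(term, bool):
        return 1 if term else 0
    if term[0] == "and":
        return ("mul", encode(term[1]), encode(term[2]))
    return ("sub", 1, encode(term[1]))  # encoding of ("not", t)

def step_a(term):
    """All one-step reducts of a term of A."""
    if isinstance(term, bool):
        return set()
    op, *args = term
    results = set()
    if all(isinstance(a, bool) for a in args):
        results.add(all(args) if op == "and" else not args[0])
    for i, a in enumerate(args):
        for r in step_a(a):
            results.add((op, *args[:i], r, *args[i + 1:]))
    return results

def step_b(term):
    """All one-step reducts of a term of B."""
    if isinstance(term, int):
        return set()
    op, *args = term
    results = set()
    if all(isinstance(a, int) for a in args):
        x, y = args
        results.add(x * y if op == "mul" else x - y)
    for i, a in enumerate(args):
        for r in step_b(a):
            results.add((op, *args[:i], r, *args[i + 1:]))
    return results

def reachable(term, step):
    """All terms reachable from `term` in any number of steps (including zero)."""
    seen, todo = {term}, [term]
    while todo:
        for nxt in step(todo.pop()):
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen

def sound(t1):
    targets = reachable(encode(t1), step_b)
    return all(encode(t2) in targets for t2 in reachable(t1, step_a))

def complete(t1):
    images = {encode(t2) for t2 in reachable(t1, step_a)}
    return reachable(encode(t1), step_b) <= images

term = ("and", ("not", False), ("not", True))
print(sound(term), complete(term))
```

In this toy setting every redex of A corresponds to exactly one redex of B, so both checks succeed on the sample term. An encoding that introduced intermediate administrative steps in B would typically fail this strict form of completeness.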

Preservation of termination

This also assumes the existence of a notion of reduction on both language A and language B.

; soundness : for any term $T_A$, if all reductions of $T_A$ converge, then all reductions of $[T_A]$ converge.

; completeness : for any term $T_A$, if all reductions of $[T_A]$ converge, then all reductions of $T_A$ converge.

In the case of compilation of a programming language, soundness guarantees that the compilation does not introduce non-termination such as endless loops or endless recursions. The completeness property is useful when language B is used to study or test a program written in language A, possibly by extracting key parts of the code: if this study or test proves that the program terminates in B, then it also terminates in A.
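Termination is undecidable in general, but for finite reduction graphs it can be checked by looking for reachable cycles. The following sketch assumes two hypothetical finite reduction relations, `STEP_A` and `STEP_B`, and an encoding given as a lookup table. The encoding deliberately introduces a looping state `spin` to show a failure of soundness.

```python
def always_terminates(term, step):
    """True if no infinite reduction starts at `term` (finite graphs only)."""
    on_path, done = set(), set()

    def visit(t):
        if t in done:
            return True
        if t in on_path:
            return False  # a reachable cycle gives an infinite reduction
        on_path.add(t)
        ok = all(visit(n) for n in step.get(t, ()))
        on_path.discard(t)
        if ok:
            done.add(t)
        return ok

    return visit(term)

# Hypothetical reduction relations: each term maps to its one-step reducts.
STEP_A = {"a0": ["a1"], "a1": []}
STEP_B = {"b0": ["b1", "spin"], "b1": [], "spin": ["spin"]}

# Hypothetical encoding of the terms of A into terms of B.
ENCODE = {"a0": "b0", "a1": "b1"}

sound = all(
    always_terminates(ENCODE[t], STEP_B)
    for t in ENCODE
    if always_terminates(t, STEP_A)
)
complete = all(
    always_terminates(t, STEP_A)
    for t in ENCODE
    if always_terminates(ENCODE[t], STEP_B)
)
print(sound, complete)  # prints: False True
```

The term `a0` always terminates in A, but its encoding `b0` can reduce to `spin` and loop forever, so soundness fails. Completeness holds here because the only encoded term that always terminates in B is `b1`, and `a1` also terminates in A.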

Preservation of observations

This assumes the existence of a notion of observation on both language A and language B. In programming languages, typical observables are the results of inputs and outputs, as opposed to pure computation. In a description language such as HTML, a typical observable is the result of page rendering.

; soundness : for every observable $obs_A$ on terms of A, there exists an observable $obs_B$ on terms of B such that for any term $T_A$ with observable $obs_A$, $[T_A]$ has observable $obs_B$.

; completeness : for every observable $obs_A$ on terms of A, there exists an observable $obs_B$ on terms of B such that for any term $[T_A]$ with observable $obs_B$, $T_A$ has observable $obs_A$.

Preservation of simulations

This assumes the existence of a notion of simulation on both language A and language B. In programming languages, a program simulates another if it can perform all the same (observable) tasks and possibly some others. Simulations are typically used to describe compile-time optimizations.

; soundness : for all terms $T_A^1, T_A^2$, if $T_A^2$ simulates $T_A^1$ then $[T_A^2]$ simulates $[T_A^1]$.

; completeness : for all terms $T_A^1, T_A^2$, if $[T_A^2]$ simulates $[T_A^1]$ then $T_A^2$ simulates $T_A^1$.

Preservation of simulations is a much stronger property than preservation of observations, which it entails. In turn, it is weaker than a property of preservation of bisimulations. As in previous cases, soundness is important for compilation, while completeness is useful for testing or proving properties.
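On finite labelled transition systems, the simulation preorder can be computed as a greatest fixpoint, which makes the two conditions checkable. The sketch below assumes two hypothetical transition systems, `LTS_A` and `LTS_B`, and an encoding `ENCODE` that drops the `b` move of `q`, as an over-eager optimization might.

```python
def simulates(lts, big, small):
    """True if state `big` simulates state `small` in a finite LTS.

    `lts` maps each state to a set of (label, successor) pairs.
    """
    states = list(lts)
    # Start from all pairs (p, q), read as "q simulates p", and refine.
    rel = {(p, q) for p in states for q in states}
    changed = True
    while changed:
        changed = False
        for (p, q) in list(rel):
            # Every move of p must be matched by an equally labelled move of q.
            ok = all(
                any(a == b and (p2, q2) in rel for (b, q2) in lts[q])
                for (a, p2) in lts[p]
            )
            if not ok:
                rel.discard((p, q))
                changed = True
    return (small, big) in rel

# Hypothetical terms of A: q can do everything p can, plus a "b" move.
LTS_A = {
    "p": {("a", "p1")}, "p1": set(),
    "q": {("a", "q1"), ("b", "q1")}, "q1": set(),
}
# Hypothetical terms of B: the encoding of q has lost its "b" move.
LTS_B = {
    "P": {("a", "P1")}, "P1": set(),
    "Q": {("a", "Q1")}, "Q1": set(),
}
ENCODE = {"p": "P", "p1": "P1", "q": "Q", "q1": "Q1"}

pairs = [(t1, t2) for t1 in LTS_A for t2 in LTS_A]
sound = all(
    simulates(LTS_B, ENCODE[t2], ENCODE[t1])
    for (t1, t2) in pairs
    if simulates(LTS_A, t2, t1)
)
complete = all(
    simulates(LTS_A, t2, t1)
    for (t1, t2) in pairs
    if simulates(LTS_B, ENCODE[t2], ENCODE[t1])
)
print(sound, complete)  # prints: True False
```

Soundness holds because every simulation between terms of A survives the encoding. Completeness fails: in B, `P` simulates `Q`, yet in A, `p` cannot match the `b` move of `q`, so a simulation proved on the encoded terms would not carry back to the original ones.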

Preservation of equivalences

This assumes the existence of a notion of equivalence on both language A and language B. Typically, this can be a notion of equality of structured data or a notion of syntactically different yet semantically identical programs, such as structural congruence or structural equivalence.

; soundness : if two terms $T_A^1$ and $T_A^2$ are equivalent in A, then $[T_A^1]$ and $[T_A^2]$ are equivalent in B.

; completeness : if two terms $[T_A^1]$ and $[T_A^2]$ are equivalent in B, then $T_A^1$ and $T_A^2$ are equivalent in A.

Preservation of distribution

This assumes the existence of a notion of distribution on both language A and language B. Typically, for compilation of distributed programs written in Acute, JoCaml or E, this means distribution of processes and data among several computers or CPUs.

; soundness : if a term $T_A$ is the composition of two agents $T_A^1~|~T_A^2$ then $[T_A]$ must be the composition of two agents $[T_A^1]~|~[T_A^2]$.

; completeness : if a term $[T_A]$ is the composition of two agents $T_B^1~|~T_B^2$ then $T_A$ must be the composition of two agents $T_A^1~|~T_A^2$ such that $[T_A^1] = T_B^1$ and $[T_A^2] = T_B^2$.

See also

* Bisimulation
* Compiler
* Semantics

External links

* [http://catamaran.labs.cs.uu.nl/twiki/pt/bin/view/Transform/WebChanges|The Program Transformation Wiki]


Wikimedia Foundation. 2010.
