Bootstrapping (compilers)

Bootstrapping (compilers)

In computer science, bootstrapping is the process of writing a compiler (or assembler) in the target programming language which it is intended to compile. Applying this technique leads to a self-hosting compiler.

A large proportion of programming languages are bootstrapped, including BASIC, C, Pascal, Factor, Haskell, Modula-2, Oberon, OCaml, Common Lisp, Scheme, Python and more.

Contents

Advantages

Bootstrapping a compiler has the following advantages:[1] [2]

  • it is a non-trivial test of the language being compiled;
  • compiler developers only need to know the language being compiled;
  • compiler development can be done in the higher level language being compiled;
  • improvements to the compiler's back-end improve not only general purpose programs but also the compiler itself; and
  • it is a comprehensive consistency check as it should be able to reproduce its own object code.

The chicken and egg problem

If one needs a compiler for language X to obtain a compiler for language X (which is written in language X), how did the first compiler get written? Possible methods to solving this chicken or the egg problem include:

  • Implementing an interpreter or compiler for language X in language Y. Niklaus Wirth reported that he wrote the first Pascal compiler in Fortran.
  • Another interpreter or compiler for X has already been written in another language Y; this is how Scheme is often bootstrapped.
  • Earlier versions of the compiler were written in a subset of X for which there existed some other compiler; this is how some supersets of Java, Haskell, and the initial Free Pascal compiler are bootstrapped.
  • The compiler for X is cross compiled from another architecture where there exists a compiler for X; this is how compilers for C are usually ported to other platforms. Also this is the method used for Free Pascal after the initial bootstrap.
  • Writing the compiler in X; then hand-compiling it from source (most likely in a non-optimized way) and running that on the code to get an optimized compiler. Donald Knuth used this for his WEB literate programming system.

Methods for distributing compilers in source code include providing a portable bytecode version of the compiler, so as to bootstrap the process of compiling the compiler with itself.

Such methods are also one way of detecting or avoiding (or both) the potential problem pointed out in Reflections on Trusting Trust.

The T-diagram is a notation used to explain these compiler bootstrap techniques.[2]

In some cases, the most convenient way to get a complicated compiler running on a system that has little or no software on it involves a series of ever more sophisticated assemblers and compilers.[3]

History

The first language to provide such a bootstrap was NELIAC. The first commercial language to do so was PL/I.

The first self-hosting compiler (excluding assemblers) was written for Lisp by Hart and Levin at MIT in 1962. They wrote a Lisp compiler in Lisp, testing it inside an existing Lisp interpreter. Once they had improved the compiler to the point where it could compile its own source code, it was self-hosting.[4]

The compiler as it exists on the standard compiler tape is a machine language program that was obtained by having the S-expression definition of the compiler work on itself through the interpreter. (AI Memo 39)[4]

This technique is only possible when an interpreter already exists for the very same language that is to be compiled. It borrows directly from the notion of running a program on itself as input, which is also used in various proofs in theoretical computer science, such as the proof that the halting problem is undecidable.

See also

References

  1. ^ Compilers and Compiler Generators: An Introduction With C++. Patrick D. Terry 1997. International Thomson Computer Press. ISBN 1850322988
  2. ^ a b "Compiler Construction and Bootstrapping" by P.D.Terry 2000. HTML. PDF. HTML.
  3. ^ "Bootstrapping a simple compiler from nothing" by Edmund GRIMLEY EVANS 2001
  4. ^ a b Tim Hart and Mike Levin. "AI Memo 39-The new compiler". ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-039.pdf. Retrieved 2008-05-23. 

Wikimedia Foundation. 2010.

Игры ⚽ Нужно сделать НИР?

Look at other dictionaries:

  • Bootstrapping — This is the history of bootstrapping or booting which began in the 1880s as a leather strap and evolved into a group of metaphors that share a common meaning, a self sustaining process that proceeds without external help. traps for leather… …   Wikipedia

  • Bootstrapping (Programmierung) — Bootstrapping (auch Ureingabe[1]) bezeichnet in der Programmierung den Prozess, mit einfachen Entwicklungswerkzeugen mächtigere zu programmieren. Die einfachste Umgebung ist etwa ein sehr grundlegender Texteditor und ein Assembler. Mit diesen… …   Deutsch Wikipedia

  • Bootstrapping (computing) — In computing, bootstrapping ( to pull oneself up by one s bootstraps ) refers to techniques that allow a simple system to activate a more complicated system. A common scenario is the start up process of a computer system, where a small program,… …   Wikipedia

  • Self-hosting — The term: Self hosting was coined to refer to the use of a computer program as part of the toolchain or operating system that produces new versions of that same program mdash;for example, a compiler that can compile its own source code. Self… …   Wikipedia

  • Transformation language — A transformation language is a computer language designed to transform some input text in a certain formal language into a modified output text that meets some specific goal.Program transformation systems such as Stratego/XT, TXL, DMS, and… …   Wikipedia

  • PyPy — Infobox Software name = PyPy caption = developer = programming language = Python latest release version = 1.0 latest release date = March 27, 2007 operating system = Cross platform genre = Python interpreter and compiler toolchain license = MIT… …   Wikipedia

  • Compiler — This article is about the computing term. For the anime, see Compiler (anime). A diagram of the operation of a typical multi language, multi target compiler A compiler is a computer program (or set of programs) that transforms source code written …   Wikipedia

  • Cross compiler — A cross compiler is a compiler capable of creating executable code for a platform other than the one on which the compiler is run. Cross compiler tools are used to generate executables for embedded system or multiple platforms. It is used to… …   Wikipedia

  • IP Pascal — is an implementation of the Pascal programming language using the IP portability platform, a multiple machine, operating system and language implementation system. Overview IP Pascal implements the language Pascaline (named after Blaise Pascal s… …   Wikipedia

  • GNU Compiler Collection — Cc1 redirects here. For other uses of CC1 or CC 1, see CC1 (disambiguation). GNU Compiler Collection Developer(s) GNU Project Initial release May 23, 1987 ( …   Wikipedia

Share the article and excerpts

Direct link
Do a right-click on the link above
and select “Copy Link”