Lex programming tool

In computer science, lex is a program that generates lexical analyzers ("scanners" or "lexers"). Lex is commonly used with the yacc parser generator. Lex, originally written by Eric Schmidt and Mike Lesk, is the standard lexical analyzer generator on many Unix systems, and a tool exhibiting its behavior is specified as part of the POSIX standard.

Lex reads an input stream specifying the lexical analyzer and outputs source code implementing the lexer in the C programming language.

Though traditionally proprietary software, versions of Lex based on the original AT&T code are available as open source, as part of systems such as OpenSolaris and Plan 9 from Bell Labs. Another popular open source version of Lex is Flex, the "fast lexical analyzer generator".

Structure of a lex file

The structure of a lex file is intentionally similar to that of a yacc file. Files are divided into three sections, separated by lines that contain only two percent signs, as follows:

Definition section
%%
Rules section
%%
C code section

*The definition section is the place to define macros and to import header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
*The rules section is the most important section; it associates patterns with C statements. Patterns are simply regular expressions. When the lexer sees some text in the input matching a given pattern, it executes the associated C code. This is the basis of how lex operates.
*The C code section contains C statements and functions that are copied verbatim to the generated source file. These statements typically contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file and link it in at compile time.
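The pairing of patterns with actions described above can be sketched in plain C. This is only an illustration, not how lex is implemented: real lex compiles all regular expressions into a single automaton, whereas here each "pattern" is a hand-written matcher function. All names (struct rule, run_rules, match_digits and so on) are hypothetical. The sketch does follow two lex conventions: the longest match wins, and a tie goes to the rule listed first.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* A "pattern" reports how many leading characters of s it matches (0 = none).
   Hypothetical stand-in for a compiled regular expression. */
typedef size_t (*matcher_fn)(const char *s);

/* An "action" receives the matched text, like the C code attached to a rule. */
typedef void (*action_fn)(const char *text, size_t len, void *ctx);

struct rule {
    matcher_fn match;
    action_fn action;
};

/* Matcher equivalent to the pattern [0-9]+ */
size_t match_digits(const char *s)
{
    size_t n = 0;
    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Matcher that accepts any single character */
size_t match_any(const char *s)
{
    return s[0] != '\0' ? 1 : 0;
}

struct counts {
    int integers;
    int others;
};

void count_integer(const char *text, size_t len, void *ctx)
{
    (void)text;
    (void)len;
    ((struct counts *)ctx)->integers++;
}

void count_other(const char *text, size_t len, void *ctx)
{
    (void)text;
    (void)len;
    ((struct counts *)ctx)->others++;
}

/* At each input position, run the action of the rule with the longest match.
   The strict ">" comparison means a tie goes to the earliest rule. */
void run_rules(const struct rule *rules, size_t nrules,
               const char *input, void *ctx)
{
    assert(rules != NULL && input != NULL);
    while (*input != '\0') {
        size_t best_len = 0, best = 0, i;

        for (i = 0; i < nrules; i++) {
            size_t len = rules[i].match(input);
            if (len > best_len) {
                best_len = len;
                best = i;
            }
        }
        if (best_len == 0) {
            input++;    /* no rule matched: skip it (lex would echo it) */
            continue;
        }
        rules[best].action(input, best_len, ctx);
        input += best_len;
    }
}
```

With the two rules listed in the order (match_digits, count_integer), (match_any, count_other), a lone digit matches both rules with length 1, and the first rule is chosen because it appears earlier in the table.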

Example of a lex file

The following is an example lex file for the flex version of lex. It recognizes strings of digits (integers) in the input and prints each one.

/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include <stdio.h>
%}

/* This tells flex to read only one input file */
%option noyywrap

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.|\n    {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}

If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable which matches and outputs strings of integers. For example, given the input:

abc123z.!&*2ghj6

the program will print:

Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
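For comparison, the behaviour of this example scanner can be approximated by hand in C. The sketch below is not the code that flex generates (lex.yy.c is table-driven and much larger); it only mirrors the two rules of the example. The function name scan_integers and its buffer-based interface are assumptions made for illustration, and the output is written into a caller-supplied buffer rather than printed directly.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: appends one "Saw an integer: N" line to out for each
   maximal run of digits in input, and returns how many runs were reported. */
int scan_integers(const char *input, char *out, size_t out_size)
{
    const char *p = input;
    size_t used = 0;
    int count = 0;

    assert(input != NULL && out != NULL && out_size > 0);
    out[0] = '\0';

    while (*p != '\0') {
        char text[64];              /* plays the role of yytext */
        const char *start = p;
        size_t len, room;
        int n;

        if (!isdigit((unsigned char)*p)) {
            p++;                    /* catch-all rule: ignore this character */
            continue;
        }
        while (isdigit((unsigned char)*p))
            p++;                    /* [0-9]+ rule: take the longest run */

        len = (size_t)(p - start);
        if (len >= sizeof text)
            len = sizeof text - 1;  /* sketch only: truncate very long runs */
        memcpy(text, start, len);
        text[len] = '\0';

        room = out_size - used;
        n = snprintf(out + used, room, "Saw an integer: %s\n", text);
        if (n < 0 || (size_t)n >= room)
            break;                  /* output buffer full: stop early */
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Calling scan_integers("abc123z.!&*2ghj6", buf, sizeof buf) with a 128-byte buffer and then writing buf to standard output reproduces the three lines shown above.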

Using Lex with Yacc

Lex and Yacc (a parser generator) are commonly used together. Yacc uses a formal grammar to parse an input stream, something which Lex cannot do using simple regular expressions (Lex is limited to simple finite state automata). However, Yacc cannot read from a simple input stream; it requires a series of tokens. Lex is often used to provide Yacc with these tokens.
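The division of labour can be illustrated with a small hand-written sketch in C. Here next_token stands in for the lex-generated yylex(): it turns characters into token codes and stores a number's value, much as a lex action would set yylval. The recursive functions stand in for a Yacc-generated parser and handle nested parentheses, which a regular expression alone cannot match. The names, the token codes and the tiny grammar (sums of numbers with parentheses) are all assumptions made for illustration. Real Yacc parsers are table-driven rather than recursive, and integer overflow is ignored here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token codes. Single-character tokens such as '+', '(' and ')'
   are returned as their own character code, as is conventional with Yacc. */
enum { TOK_END = 0, TOK_NUMBER = 258, TOK_BAD = 259 };

struct lexer {
    const char *pos;    /* next unread character */
    long value;         /* value of the last TOK_NUMBER (like yylval) */
    int tok;            /* current lookahead token */
};

/* Stand-in for yylex(): returns the code of the next token. */
static int next_token(struct lexer *lx)
{
    while (*lx->pos == ' ' || *lx->pos == '\t' || *lx->pos == '\n')
        lx->pos++;
    if (*lx->pos == '\0')
        return TOK_END;
    if (isdigit((unsigned char)*lx->pos)) {
        long v = 0;
        while (isdigit((unsigned char)*lx->pos))
            v = v * 10 + (*lx->pos++ - '0');
        lx->value = v;
        return TOK_NUMBER;
    }
    if (*lx->pos == '+' || *lx->pos == '(' || *lx->pos == ')')
        return (unsigned char)*lx->pos++;
    lx->pos++;
    return TOK_BAD;
}

static int parse_sum(struct lexer *lx, long *out);

/* term : NUMBER | '(' sum ')' */
static int parse_term(struct lexer *lx, long *out)
{
    if (lx->tok == TOK_NUMBER) {
        *out = lx->value;
        lx->tok = next_token(lx);
        return 0;
    }
    if (lx->tok == '(') {
        lx->tok = next_token(lx);
        if (parse_sum(lx, out) != 0 || lx->tok != ')')
            return -1;
        lx->tok = next_token(lx);
        return 0;
    }
    return -1;
}

/* sum : term | sum '+' term */
static int parse_sum(struct lexer *lx, long *out)
{
    long total, rhs;

    if (parse_term(lx, &total) != 0)
        return -1;
    while (lx->tok == '+') {
        lx->tok = next_token(lx);
        if (parse_term(lx, &rhs) != 0)
            return -1;
        total += rhs;
    }
    *out = total;
    return 0;
}

/* Returns 0 and stores the sum in *result, or -1 on a syntax error. */
int evaluate(const char *input, long *result)
{
    struct lexer lx;
    long total;

    assert(input != NULL && result != NULL);
    lx.pos = input;
    lx.value = 0;
    lx.tok = next_token(&lx);
    if (parse_sum(&lx, &total) != 0 || lx.tok != TOK_END)
        return -1;
    *result = total;
    return 0;
}
```

The parser never inspects individual characters; it only sees the token codes, which is the same contract a Yacc parser has with yylex().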

Lex and make

The make utility can be used to maintain programs that involve lex. make assumes that a file that has an extension of .l is a lex source file. It knows how such a file must be processed to create an object file.

Suppose that a list of dependencies in a makefile contains a filename x.o, and there exists a file x.l. If x.l was modified later than the file x.o (or if x.o does not exist), then make will cause lex to be run on x.l, and then cause the object file x.o to be created from the resulting lex.yy.c. The make internal macro LFLAGS can be used to specify lex options to be invoked automatically by make. [Source: The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition. The IEEE and The Open Group.]


See also

*Flex lexical analyser
*List of C# lexer generators

Wikimedia Foundation. 2010.
