Lex (software)

Lex is a computer program that generates lexical analyzers ("scanners" or "lexers").[1][2]

Lex is commonly used with the yacc parser generator. Originally written by Mike Lesk and Eric Schmidt[3] and described in 1975,[4][5] Lex is the standard lexical analyzer generator on many Unix systems, and an equivalent tool is specified as part of the POSIX standard.

Lex reads an input stream specifying the lexical analyzer and outputs source code implementing the lexer in the C programming language.

Open source

Though originally distributed as proprietary software, some versions of Lex are now open source. Versions based on the original AT&T code are distributed as part of open source systems such as OpenSolaris and Plan 9 from Bell Labs. One popular open source version of Lex, called flex (the "fast lexical analyzer"), is not derived from the proprietary code.

Structure of a Lex file

The structure of a Lex file is intentionally similar to that of a yacc file. Files are divided into three sections, separated by lines that contain only two percent signs (%%), as follows:

  • The definition section defines macros and imports header files written in C. It is also possible to write any C code here, which will be copied verbatim into the generated source file.
  • The rules section associates regular expression patterns with C statements. When the lexer sees text in the input matching a given pattern, it will execute the associated C code.
  • The C code section contains C statements and functions that are copied verbatim to the generated source file. These typically contain code called by the rules in the rules section. In large programs it is more convenient to place this code in a separate file that is compiled separately and linked in.

Example of a Lex file

The following is an example Lex file for the flex version of Lex. It recognizes strings of digits (positive integers) in the input and simply prints them out.

/*** Definition section ***/

%{
/* C code to be copied verbatim */
#include <stdio.h>
%}

/* This tells flex to read only one input file */
%option noyywrap

%%
    /*** Rules section ***/

    /* [0-9]+ matches a string of one or more digits */
[0-9]+  {
            /* yytext is a string containing the matched text. */
            printf("Saw an integer: %s\n", yytext);
        }

.|\n    {   /* Ignore all other characters. */   }

%%
/*** C Code section ***/

int main(void)
{
    /* Call the lexer, then quit. */
    yylex();
    return 0;
}

If this input is given to flex, it will be converted into a C file, lex.yy.c. This can be compiled into an executable that matches and outputs strings of integers. For example, given the input:

abc123z.!&*2gj6

the program will print:

Saw an integer: 123
Saw an integer: 2
Saw an integer: 6
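
For comparison, the behavior of such a scanner can be approximated by hand in plain C. The sketch below is not the code that flex generates (a generated scanner is table-driven); it only mimics the two rules of a digits-only scanner. The function name scan_integers and its return value are assumptions made for this illustration.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: scan `input`, write one line to `out` for each
   maximal run of digits, and return the number of runs found. */
static int scan_integers(const char *input, FILE *out)
{
    int count = 0;
    const char *p = input;

    while (*p != '\0') {
        if (isdigit((unsigned char)*p)) {
            const char *start = p;
            /* Longest match: keep consuming digits, as [0-9]+ would. */
            while (isdigit((unsigned char)*p))
                p++;
            fprintf(out, "Saw an integer: %.*s\n", (int)(p - start), start);
            count++;
        } else {
            /* Counterpart of the .|\n rule: skip any other character. */
            p++;
        }
    }
    return count;
}
```

Calling scan_integers("abc123z.!&*2gj6", stdout) writes three lines, for 123, 2 and 6, and returns 3. Writing this loop by hand is manageable for one pattern; a generator such as Lex becomes more attractive as the number of patterns grows.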

Using Lex with other programming tools

Using Lex with parser generators

Lex and parser generators, such as Yacc or Bison, are commonly used together. Parser generators use a formal grammar to parse an input stream, something Lex cannot do with regular expressions alone: a Lex scanner is limited to finite state automata, which cannot recognize arbitrarily nested constructs such as balanced parentheses.
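
One way to see the limit of finite state automata is to check whether parentheses are balanced. The check has to remember how deeply the input is nested, and that depth has no fixed upper bound, so no fixed set of states can record it. The C sketch below keeps the depth in a counter; the function name is_balanced is hypothetical and chosen only for this illustration.

```c
#include <stdbool.h>

/* Hypothetical helper: return true if every '(' in `s` has a matching ')'.
   The `depth` counter stands in for the unbounded memory that a finite
   state automaton lacks (here it is limited only by the range of long). */
static bool is_balanced(const char *s)
{
    long depth = 0;

    for (; *s != '\0'; s++) {
        if (*s == '(') {
            depth++;
        } else if (*s == ')') {
            if (depth == 0)
                return false;   /* closing parenthesis with no opener */
            depth--;
        }
    }
    return depth == 0;          /* every opener must have been closed */
}
```

A grammar-based parser handles this kind of nesting with a stack, which is why the job is usually left to the parser rather than the lexer.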

It is typically preferable to have a (Yacc-generated, say) parser be fed a token-stream as input, rather than having it consume the input character-stream directly. Lex is often used to produce such a token-stream.
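
The division of labor can be sketched in C as follows. This is a hand-written illustration under stated assumptions, not Lex or Yacc output: the token kinds, the struct token type, and the functions next_token and parse_sum are all hypothetical. next_token plays the role that yylex() plays in a Lex-generated scanner, and parse_sum stands in for a parser that accepts sums such as "12 + 30+4". Overflow of long is not checked.

```c
#include <ctype.h>

/* Hypothetical token kinds; a real grammar would define its own. */
enum token_kind { TOK_EOF, TOK_NUMBER, TOK_PLUS, TOK_UNKNOWN };

struct token {
    enum token_kind kind;
    long value;              /* meaningful only for TOK_NUMBER */
};

/* Lexer half: consume characters from *cursor and return the next token. */
static struct token next_token(const char **cursor)
{
    struct token tok = { TOK_EOF, 0 };
    const char *p = *cursor;

    while (isspace((unsigned char)*p))   /* whitespace never reaches the parser */
        p++;

    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {
        tok.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            tok.value = tok.value * 10 + (*p++ - '0');
    } else if (*p == '+') {
        tok.kind = TOK_PLUS;
        p++;
    } else {
        tok.kind = TOK_UNKNOWN;
        p++;
    }

    *cursor = p;
    return tok;
}

/* Parser half: accept NUMBER ('+' NUMBER)* and store the sum in *result.
   It sees only tokens, never individual characters.
   Returns 1 on success and 0 on a syntax error. */
static int parse_sum(const char *input, long *result)
{
    const char *cursor = input;
    struct token tok = next_token(&cursor);
    long sum;

    if (tok.kind != TOK_NUMBER)
        return 0;
    sum = tok.value;

    tok = next_token(&cursor);
    while (tok.kind == TOK_PLUS) {
        tok = next_token(&cursor);
        if (tok.kind != TOK_NUMBER)
            return 0;
        sum += tok.value;
        tok = next_token(&cursor);
    }
    if (tok.kind != TOK_EOF)
        return 0;

    *result = sum;
    return 1;
}
```

Because the parser only ever calls next_token, details such as whitespace handling and digit grouping stay in one place, and the grammar logic in parse_sum remains short.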

Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer.

Lex and make

make is a utility that can be used to maintain programs involving Lex. make assumes that a file with the extension .l is a Lex source file. The make internal macro LFLAGS can be used to specify Lex options that make passes to Lex automatically.[6]

See also


  1. ^ Levine, John R.; Mason, Tony; Brown, Doug (1992). lex & yacc (2 ed.). O'Reilly. pp. 1–2. ISBN 1-56592-000-7. 
  2. ^ Levine, John (August 2009). flex & bison. O'Reilly Media. p. 304. ISBN 978-0-596-15597-1. 
  3. ^ Lesk, M.E.; Schmidt, E. "Lex – A Lexical Analyzer Generator". Retrieved August 16, 2010. 
  4. ^ Lesk, M.E.; Schmidt, E. (July 21, 1975). "Lex – A Lexical Analyzer Generator" (PDF). UNIX TIME-SHARING SYSTEM:UNIX PROGRAMMER’S MANUAL, Seventh Edition, Volume 2B. bell-labs.com. Retrieved Dec 20, 2011. 
  5. ^ Lesk, M.E. (October 1975). "Lex – A Lexical Analyzer Generator". Comp. Sci. Tech. Rep. No. 39. Murray Hill, New Jersey: Bell Laboratories. 
  6. ^ "make". The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition. The IEEE and The Open Group. 2004. 

External links