Lex (software)
Lex is a computer program that generates lexical analyzers ("scanners" or "lexers").[1][2] It is commonly used with the yacc parser generator and is the standard lexical analyzer generator on many Unix and Unix-like systems. An equivalent tool is specified as part of the POSIX standard.[3] Lex reads an input stream specifying the lexical analyzer and writes source code which implements the lexical analyzer in the C programming language. In addition to C, some old versions of Lex could generate a lexer in Ratfor.[4] HistoryLex was originally written by Mike Lesk and Eric Schmidt[5] and described in 1975.[6][7] In the following years, Lex became standard lexical analyzer generator on many Unix and Unix-like systems. In 1983, Lex was one of several UNIX tools available for Charles River Data Systems' UNOS operating system under Bell Laboratories license.[8] Although originally distributed as proprietary software, some versions of Lex are now open-source. Open-source versions of Lex, based on the original proprietary code, are now distributed with open-source operating systems such as OpenSolaris and Plan 9 from Bell Labs. One popular open-source version of Lex, called flex, or the "fast lexical analyzer", is not derived from proprietary coding. Structure of a Lex fileThe structure of a Lex file is intentionally similar to that of a yacc file: files are divided into three sections, separated by lines that contain only two percent signs, as follows:
Example of a Lex fileThe following is an example Lex file for the flex version of Lex. It recognizes strings of numbers (positive integers) in the input, and simply prints them out. /*** Definition section ***/
%{
/* C code to be copied verbatim */
#include <stdio.h>
%}
%%
/*** Rules section ***/
/* [0-9]+ matches a string of one or more digits */
[0-9]+ {
/* yytext is a string containing the matched text. */
printf("Saw an integer: %s\n", yytext);
}
.|\n { /* Ignore all other characters. */ }
%%
/*** C Code section ***/
int main(void)
{
/* Call the lexer, then quit. */
yylex();
return 0;
}
If this input is given to abc123z.!&*2gj6 the program will print: Saw an integer: 123 Saw an integer: 2 Saw an integer: 6 Using Lex with other programming toolsUsing Lex with parser generatorsLex, as with other lexical analyzers, limits rules to those which can be described by regular expressions. Due to this, Lex can be implemented by a finite-state automata as shown by the Chomsky hierarchy of languages. To recognize more complex languages, Lex is often used with parser generators such as Yacc or Bison. Parser generators use a formal grammar to parse an input stream. It is typically preferable to have a parser, one generated by Yacc for instance, accept a stream of tokens (a "token-stream") as input, rather than having to process a stream of characters (a "character-stream") directly. Lex is often used to produce such a token-stream. Scannerless parsing refers to parsing the input character-stream directly, without a distinct lexer. Lex and makemake is a utility that can be used to maintain programs involving Lex. Make assumes that a file that has an extension of See alsoReferences
External linksThe Wikibook Guide to Unix has a page on the topic of: Commands
|