6.3. Directives

This section contains a number of lines of the form:

%<directive name> <argument> ...

These directives are annotations that help Happy generate the Haskell code for the grammar. Some of them are mandatory and the rest are optional.

6.3.1. Token Type

%tokentype   { <valid Haskell type> }

(mandatory) The %tokentype directive gives the type of the tokens passed from the lexical analyser to the parser, so that Happy can supply types for functions and data in the generated parser.
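
As a minimal sketch, suppose the lexer produces values of a type called Token. The type name and its constructors are assumptions made for illustration, not something Happy provides. The directive, together with the definition of the type in the Haskell code at the end of the grammar file, might then look like this:

```haskell
%tokentype { Token }

-- (other directives, %%, and the grammar rules are omitted here)

{
-- Hypothetical token type produced by the lexer.
data Token
  = TokenInt Int   -- integer literal, carrying its value
  | TokenPlus      -- the symbol '+'
  | TokenTimes     -- the symbol '*'
  deriving (Show, Eq)
}
```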

6.3.2. Tokens

%token <name> { <Haskell pattern> }
       <name> { <Haskell pattern> }
       ...

(mandatory) The %token directive is used to tell Happy about all the terminal symbols used in the grammar. Each terminal has a name, by which it is referred to in the grammar itself, and a Haskell representation enclosed in braces. Each of the patterns must be of the same type, given by the %tokentype directive.

The name of each terminal follows the lexical rules for Happy identifiers given above. There are no lexical differences between terminals and non-terminals in the grammar, so it is recommended that you stick to a convention: for example, upper case letters for terminals and lower case for non-terminals, or vice versa.

Happy will give you a warning if you try to use the same identifier both as a non-terminal and a terminal, or introduce an identifier which is declared as neither.

To save writing lots of projection functions that map tokens to their components, you can include $$ in your Haskell pattern. For example:

%token INT { TokenInt $$ }
       ...

This makes the semantic value of INT refer to the first argument of TokenInt rather than the whole token, eliminating the need for any projection function.
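
The difference can be sketched by comparing the two styles side by side. In this fragment, the non-terminal Factor and the projection function intValue are hypothetical names used only for illustration:

```haskell
-- Without $$, the semantic value of INT is the whole token, so the
-- action has to call a projection function (intValue is hypothetical):
--
--   %token INT { TokenInt _ }
--   ...
--   Factor : INT { intValue $1 }

-- With $$, the semantic value of INT is the argument of TokenInt:
%token INT { TokenInt $$ }

%%

Factor : INT { $1 }
```

In the second form, $1 in the action already has the type of the argument of TokenInt, so no helper is needed.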

6.3.3. Parser Name

%name <Haskell identifier> [ <non-terminal> ]
...

(optional) The %name directive is followed by a valid Haskell identifier, and gives the name of the top-level parsing function in the generated parser. This is the only function that needs to be exported from a parser module.

If the %name directive is omitted, it defaults to happyParse.

The %name directive takes an optional second parameter which specifies the top-level non-terminal which is to be parsed. If this parameter is omitted, it defaults to the first non-terminal defined in the grammar.

Multiple %name directives may be given, specifying multiple parser entry points for this grammar (see Section 2.7, “Generating Multiple Parsers From a Single Grammar”). When multiple %name directives are given, they must all specify explicit non-terminals.
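
For instance, a grammar with two entry points might declare them as follows. The parser names and the non-terminals Program and Expr are assumptions chosen for this sketch:

```haskell
-- Entry point for whole programs, starting at non-terminal Program.
%name parseProgram Program

-- Entry point for single expressions, starting at non-terminal Expr.
%name parseExpr    Expr
```

In a grammar that uses neither %monad nor %lexer, each entry point would be a function from a list of tokens to the semantic value of its non-terminal, so the two functions generally have different result types.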

6.3.4. Partial Parsers

%partial <Haskell identifier> [ <non-terminal> ]
...

The %partial directive can be used instead of %name. It indicates that the generated parser should be able to parse an initial portion of the input. In contrast, a parser specified with %name will only parse the entire input.

A parser specified with %partial will stop parsing and return a result as soon as there exists a complete parse, and no more of the input can be parsed. It does this by accepting the parse if it is followed by the error token, rather than insisting that the parse is followed by the end of the token stream (or the eof token in the case of a %lexer parser).
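
The contrast with %name can be sketched by declaring both kinds of entry point for the same non-terminal. The parser names and the non-terminal Expr are hypothetical:

```haskell
-- Accepts only if the entire token stream forms an Expr.
%name    parseExpr       Expr

-- Accepts the longest initial portion of the input that forms an Expr.
%partial parseExprPrefix Expr
```

Given tokens corresponding to 1 + 2 ) in a grammar where a closing parenthesis cannot continue an Expr, parseExpr would be expected to report a parse error, whereas parseExprPrefix would be expected to return the result for 1 + 2.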

6.3.5. Monad Directive

%monad { <type> } { <then> } { <return> }

(optional) The %monad directive takes three arguments: the type constructor of the monad, the then (or bind) operation, and the return (or unit) operation. The type constructor can be any type with kind * -> *.

Monad declarations are described in more detail in Section 2.5, “Monadic Parsers”.
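
A minimal sketch of the three arguments, using a simple error monad in the style of Section 2.5, is shown below. The names E, Ok, Failed, thenE and returnE are user-defined and chosen here for illustration:

```haskell
%monad { E } { thenE } { returnE }

-- (other directives, %%, and the grammar rules are omitted here)

{
-- A computation either succeeds with a value or fails with a message.
data E a = Ok a | Failed String

-- The <then> operation: run m, and feed its result to k on success.
thenE :: E a -> (a -> E b) -> E b
m `thenE` k =
  case m of
    Ok a     -> k a
    Failed e -> Failed e

-- The <return> operation: succeed without doing anything else.
returnE :: a -> E a
returnE a = Ok a
}
```

Here E has kind * -> *, as the directive requires, and the generated parser returns its result inside E.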

6.3.6. Lexical Analyser

%lexer { <lexer> } { <eof> }

(optional) The %lexer directive takes two arguments: <lexer> is the name of the lexical analyser function, and <eof> is a token that is to be treated as the end of file.

Lexer declarations are described in more detail in Section 2.5.2, “Threaded Lexers”.
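
The following fragment sketches how the two arguments fit together with %monad, assuming a monad P that threads the remaining input as a String, in the style of Section 2.5. All of the names (Token, TokenEOF, P, ParseResult, thenP, returnP, lexer) are user-defined and illustrative:

```haskell
{
import Data.Char (isDigit, isSpace)
}

%monad { P } { thenP } { returnP }
%lexer { lexer } { TokenEOF }

-- (other directives, %%, and the grammar rules are omitted here)

{
data Token = TokenInt Int | TokenPlus | TokenEOF

data ParseResult a = Ok a | Failed String

-- A parser computation is a function of the remaining input.
type P a = String -> ParseResult a

thenP :: P a -> (a -> P b) -> P b
m `thenP` k = \s ->
  case m s of
    Ok a     -> k a s
    Failed e -> Failed e

returnP :: a -> P a
returnP a = \_ -> Ok a

-- Read one token, then pass it and the rest of the input to the
-- continuation. At the end of the input, pass the <eof> token.
lexer :: (Token -> P a) -> P a
lexer cont []         = cont TokenEOF []
lexer cont ('+' : cs) = cont TokenPlus cs
lexer cont (c : cs)
  | isSpace c = lexer cont cs
  | isDigit c = let (ds, rest) = span isDigit (c : cs)
                in cont (TokenInt (read ds)) rest
  | otherwise = Failed ("unexpected character: " ++ [c])
}
```

The lexer is written in continuation-passing style: instead of returning a list of tokens, it produces one token at a time and hands it to the parser.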

6.3.7. Precedence Declarations

%left     <name> ...
%right    <name> ...
%nonassoc <name> ...

These declarations are used to specify the precedences and associativity of tokens. The precedence assigned by a %left, %right or %nonassoc declaration is defined to be higher than the precedence assigned by all declarations earlier in the file, and lower than the precedence assigned by all declarations later in the file.

The associativity of a token relative to tokens in the same %left, %right, or %nonassoc declaration is to the left, to the right, or non-associative respectively.

Precedence declarations are described in more detail in Section 2.3, “Using Precedences”.
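
As an illustration of the ordering rule, consider the following fragment for arithmetic and comparison expressions. The constructors used in the actions (Less, Plus and so on) and the terminal int are hypothetical:

```haskell
-- Lowest precedence first, highest precedence last.
%nonassoc '<' '>'
%left     '+' '-'
%left     '*' '/'

%%

Exp : Exp '<' Exp   { Less $1 $3 }
    | Exp '>' Exp   { Greater $1 $3 }
    | Exp '+' Exp   { Plus $1 $3 }
    | Exp '-' Exp   { Minus $1 $3 }
    | Exp '*' Exp   { Times $1 $3 }
    | Exp '/' Exp   { Div $1 $3 }
    | int           { Int $1 }
```

With these declarations, 1 + 2 * 3 groups as 1 + (2 * 3) because '*' is declared later than '+', and 8 - 3 - 2 groups as (8 - 3) - 2 because '-' is left-associative. An input such as 1 < 2 < 3 is rejected, because '<' is non-associative.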

6.3.8. Expect Declarations

%expect <number>

(optional) More often than not, the grammar you write will have conflicts, and these conflicts generate warnings. Once you have checked the warnings and made sure that Happy resolves the conflicts correctly, the warnings are just noise. The %expect directive gives a way of suppressing them. Declaring %expect n tells Happy: “There are exactly n shift/reduce conflicts and zero reduce/reduce conflicts in this grammar. I promise I have checked them and they are resolved correctly.” When processing the grammar, Happy checks the actual number of conflicts against the %expect declaration, if any, and reports an error if there is a discrepancy.

Happy's %expect directive works exactly like that of yacc.
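
The classic case is the "dangling else". In the sketch below, the non-terminals Stmt and Exp, the quoted terminals, and the constructors in the actions are all hypothetical, and the rest of the grammar (including the rules for Exp) is assumed to be free of conflicts:

```haskell
-- One shift/reduce conflict is expected, on the token 'else'.
%expect 1

%%

Stmt : 'if' Exp 'then' Stmt              { IfThen $2 $4 }
     | 'if' Exp 'then' Stmt 'else' Stmt  { IfThenElse $2 $4 $6 }
     | 'skip'                            { Skip }
```

After parsing 'if' Exp 'then' Stmt with 'else' as the next token, the parser can either reduce by the first rule or shift the 'else'. Resolving the conflict in favour of the shift attaches each 'else' to the nearest 'if', which is usually the intended reading.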

6.3.9. Error Declaration

%error { <identifier> }

Specifies the function to be called in the event of a parse error. The type of <identifier> varies depending on the presence of %lexer (see Section 2.5.4, “Summary”).
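
For the simplest case, a grammar with neither %monad nor %lexer, a handler might be sketched as follows. The name parseError is a user-chosen identifier, and Token stands for the hypothetical type named in %tokentype, assumed here to have a Show instance:

```haskell
%error { parseError }

-- (other directives, %%, and the grammar rules are omitted here)

{
-- Called with the tokens that had not yet been consumed when the
-- error was detected. The result type is polymorphic, so the
-- function cannot return normally; here it calls error.
parseError :: [Token] -> a
parseError []        = error "Parse error at end of input"
parseError (tok : _) = error ("Parse error at token: " ++ show tok)
}
```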

6.3.10. Attribute Type Declaration

%attributetype { <valid Haskell type declaration> }

(optional) This directive allows you to declare the type of the attributes record when defining an attribute grammar. If this declaration is not given, Happy will choose a default. This declaration may only appear once in a grammar.

Attribute grammars are explained in Chapter 4, Attribute Grammars.

6.3.11. Attribute Declaration

%attribute <Haskell identifier> { <valid Haskell type> }

The presence of one or more of these directives declares that the grammar is an attribute grammar. The first attribute listed becomes the default attribute. Each %attribute directive generates a field in the attributes record with the given label and type. If there is an %attributetype declaration in the grammar which introduces type variables, then the type of an attribute may mention any such type variables.

Attribute grammars are explained in Chapter 4, Attribute Grammars.
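
As a small sketch of how %attributetype and %attribute combine, the following declarations describe an attributes record with a type variable. The names Attrs, value and len are chosen for illustration:

```haskell
-- The attributes record type, introducing the type variable a.
%attributetype { Attrs a }

-- Listed first, so value is the default attribute. Its type
-- mentions the variable a introduced above.
%attribute value { a }

-- A second attribute with a fixed type.
%attribute len   { Int }
```

Conceptually, this gives an attributes record with two fields, value of type a and len of type Int.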