The Accent Compiler Compiler
The ACCENT Grammar Language
Conventions

The Accent Grammar Language is described by rules of the form

    N : M11 M12 ...
      | M21 M22 ...
      ...
      ;

which state that a phrase for N is composed from phrases for M11 and M12 and ... or from phrases for M21 and M22 ..., etc.

Terminal symbols are enclosed in double quotes, e.g. "%out". In addition, the terminal symbol identifier denotes a sequence of one or more letters, digits, and underscores ("_"), starting with a letter. The terminal symbol number denotes a sequence of one or more digits. The terminal symbol character_constant denotes a character constant as in the C language. The terminal symbol c_code represents arbitrary C code (comments in this code must be closed and curly braces must match).

Grammar

    grammar : global_prelude_part token_declaration_part rule_part ;

The Accent Grammar Language is the set of phrases for the symbol grammar.

Global Prelude

    global_prelude_part : global_prelude | empty ;
    global_prelude : "%prelude" block ;
    block : "{" c_code "}" ;
    empty : ;

The optional global_prelude_part serves to introduce user defined functions, global variables, and types. The text enclosed in curly braces is inserted verbatim at the beginning of the generated program file.

Token Declarations

    token_declaration_part : "%token" token_declaration_list ";" | empty ;
    token_declaration_list : token_declaration "," token_declaration_list | token_declaration ;
    token_declaration : identifier ;

The optional token_declaration_part introduces symbolic names for terminal symbols (tokens). A name must not appear more than once in the list. These names may be used as members of grammatical rules. The actual representation of the corresponding terminal symbols must be defined by lexical rules that are not part of the Accent specification.

As opposed to nonterminal symbols, terminal symbols are declared without parameters. Nevertheless, they have an implicit output parameter of type YYSTYPE which (if used) must be defined in the corresponding lexical rule.
(See Using Lex with Accent for a discussion.)

Rule Part

    rule_part : rule_list ;
    rule_list : rule rule_list | rule ;
    rule : left_hand_side ":" right_hand_side ";" ;

A nonterminal is defined by a rule that lists one or more alternatives for constructing a phrase of the nonterminal. The first rule specifies the start symbol of the grammar. The language defined by the grammar is given by the phrases of the start symbol.

Left Hand Side

    left_hand_side : nonterminal formal_parameter_spec ;
    nonterminal : identifier ;

The left_hand_side of a rule introduces the nonterminal that is defined by the rule. It also specifies the parameters of the nonterminal, which represent its semantic attributes.
Example
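A minimal sketch of such a left hand side, assuming a hypothetical nonterminal term with one attribute n and a hypothetical token NUMBER (neither name comes from this page). Because no mode is given, n is an out parameter:

```
term<n> :
  NUMBER<x> { *n = x; }
;
```

Here term<n> is the left hand side; everything between the colon and the semicolon is the right hand side. Since the start symbol must have no parameter, a rule like this could not be the first rule of a grammar.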
The values of these parameters must be defined by semantic actions in the alternatives of the body of the rule. When the nonterminal is used as a member in the body of a rule, actual parameters are attached. Using these parameters, the attributes of the corresponding nonterminal can be accessed.
Example
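As an illustrative sketch, assume a hypothetical nonterminal term that is defined elsewhere with a single out parameter. A rule for another hypothetical nonterminal, sum, could attach the actual parameters a and b to two uses of term and read their values in a semantic action:

```
sum<s> :
  term<a> '+' term<b> { *s = a + b; }
;
```

The identifiers a and b carry the attribute values of the two term phrases, while s, being an out parameter of the left hand side, is written through the dereferencing operator.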
    formal_parameter_spec :
        empty
      | "<" parameter_spec_list ">"
      | "<" "%in" parameter_spec_list ">"
      | "<" "%out" parameter_spec_list ">"
      | "<" "%in" parameter_spec_list "%out" parameter_spec_list ">"
      ;
    parameter_spec_list : parameter_spec "," parameter_spec_list | parameter_spec ;

Parameters may be of mode in or mode out. If no mode is specified, all parameters are of mode out. Otherwise, parameters are of mode in if they appear in a list preceded by %in; they are of mode out if the list is preceded by %out.

An in parameter (inherited attribute) passes a value from the application of a nonterminal to the right hand side defining the symbol. It is used to pass context information to a rule.

An out parameter (synthesized attribute) passes a value from the right hand side defining a symbol to the application of the symbol. It is used to pass the semantic value of a rule to the context.
Example
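A sketch of both modes in one rule, under stated assumptions: the nonterminals block and indented_line, the token WORD, and the parameter names are all hypothetical, and the value delivered by WORD is assumed to be set by the lexer. The in parameter depth carries context into the rule; the out parameter width carries a result back:

```
block :
  { d = 2; } indented_line<d, w> { printf("%d\n", w); }
;

indented_line<%in int depth %out int width> :
  WORD<len> { *width = depth + (int) len; }
;
```

In block, the actual parameter d is given a value before indented_line is applied, and w is read afterwards. Inside indented_line, the in parameter depth is used by name, whereas the out parameter width is assigned through the dereferencing operator. The call to printf assumes that stdio.h is included in the global prelude.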
    parameter_spec : parameter_type_opt parameter_name ;
    parameter_type_opt : parameter_type | empty ;
    parameter_type : identifier ;
    parameter_name : identifier ;

A parameter specification may be written in the form type name, in which case type is the type of the parameter name. If the type is missing, the parameter is of type YYSTYPE (which is also the type of tokens). YYSTYPE is equivalent to long if not defined by the user. (See Using Lex with Accent for how to define YYSTYPE.)

The start symbol of the grammar must have no parameter.

Right Hand Side

    right_hand_side : local_prelude_option alternative_list ;

The right hand side of a rule specifies a list of alternatives. This list may be preceded by a prelude that introduces common declarations and initialisation statements in C.

    local_prelude_option : local_prelude | empty ;
    local_prelude : "%prelude" block ;

In the generated program the content of block (without the enclosing curly braces) precedes the code generated for the alternatives of the rule. The items declared in the prelude are visible within all alternatives.

    alternative_list : alternative "|" alternative_list | alternative ;
    alternative : member_list alternative_annotation_option ;
    member_list : member member_list | empty ;
    alternative_annotation_option : alternative_annotation | empty ;
    alternative_annotation : "%prio" number ;
    member : member_annotation_option item ;
    member_annotation_option : member_annotation | empty ;
    member_annotation : "%short" | "%long" ;
    item : symbol | literal | grouping | option | repetition | semantic_action ;

The alternatives appearing on the right hand side of a rule specify how to construct a phrase for the nonterminal of the left hand side. An alternative is a sequence of members. These members may be nonterminal symbols, token symbols, or literals (terminal symbols that appear verbatim in the grammar). The right hand side may be written as a regular expression constructed by grouping, option, and repetition.
Semantic actions may be inserted at any place.

Ambiguities in the grammar may be resolved by annotating alternatives and members.

If two alternatives of a nonterminal can produce the same string, then both alternatives must be postfixed by an annotation of the form

    %prio N

N defines the priority of the alternative. The alternative with the higher priority is selected.

If the same alternative can produce the same string in more than one way because members of that alternative can cover substrings of that string of different length, the rightmost of these members must be prefixed with an annotation of the form

    %short

or

    %long

If the member is prefixed by "%short" (resp. "%long"), the variant that produces the short (resp. long) substring is selected.

Nonterminal and Terminal Symbols

    symbol : symbol_name actual_parameters_option ;
    symbol_name : identifier ;

The symbol name must be declared as a nonterminal (by specifying a rule for the identifier) or as a token (by listing the identifier in the token declaration part).

    actual_parameters_option : actual_parameters | empty ;
    actual_parameters : "<" actual_parameter_list ">" ;
    actual_parameter_list : actual_parameter "," actual_parameter_list | actual_parameter ;
    actual_parameter : identifier ;

For each formal parameter of the symbol there must be a corresponding actual parameter. A parameter must be an identifier. In the generated C code, this identifier is declared as a variable of the type of the corresponding formal parameter. The same parameter name may be used at different places, but then the types at those positions must be identical.

    literal : character_constant ;

Besides being declared as a token, a terminal symbol can also appear verbatim as a member of a rule.

Structured Members

    grouping : "(" alternative_list ")" ;

A construct

    ( alt_1 | alt_2 | ... )

matches a phrase generated by the alternatives alt_i.

    option : "(" alternative_list ")?" ;

A construct

    ( alt_1 | alt_2 | ... )?

matches the empty phrase or a phrase generated by the alternatives alt_i.

    repetition : "(" alternative_list ")*" ;

A construct

    ( alt_1 | alt_2 | ... )*

matches the empty phrase or any sequence of phrases generated by the alternatives alt_i.

Semantic Actions

    semantic_action : block ;

Semantic actions may be inserted as members of alternatives. They do not influence the parsing process.

Semantic actions can contain arbitrary C code enclosed in curly braces. This code is executed in a second phase after the parsing process. The semantic actions of selected alternatives are executed from left to right in the given order.

Output parameters of preceding symbols may be accessed in the semantic action. Input parameters of following symbols must be defined. Parameters are accessed by specifying their names. The names of the output parameters of the left hand side must be preceded by a dereferencing operator ('*').
In the generated program, the curly braces enclosing the action do not appear (hence a semantic action at the beginning of an alternative may contain declarations of variables that are local to the alternative).
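
The pieces described on this page can be combined into a small specification. The following is a sketch only: the nonterminal pair and the token NUMBER are hypothetical names, the lexer that recognises NUMBER and sets its value is assumed to exist outside the specification, and YYSTYPE is assumed to be left at its default (long).

```
%prelude {
#include <stdio.h>
}

%token NUMBER;

pair :
  { long sum; }
  NUMBER<a> ',' NUMBER<b>
  { sum = a + b; printf("%ld\n", sum); }
;
```

The first action only declares sum; because the enclosing braces are dropped in the generated program, the declaration remains visible in the second action of the same alternative. The start symbol pair has no parameters, as required, and a and b receive the implicit output parameters of the two NUMBER tokens.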