PLI Tutorial 4
Recursive descent and LALR(1) parsing

(Lm.n refers to Louden, Exercise m.n.)

Complete Tutorial 3.

Part A.  Recursive descent parsing.

1. Consider the Java program Calc.java provided in the Resources page
   to evaluate arithmetic expressions.  Modify this program to evaluate 
   expressions defined by the second grammar for expressions.  
   (You do not need to implement a proper lexical analyser for this 
   question.)

   (A solution is provided in program CalcRD.java in the Resources page,
   but you really should try it first before reading our solution.)

2. (L4.3) Given the grammar

      statement -> assignment | call | "other"
      assignment -> identifier ":=" exp
      call -> identifier "(" exp-list ")"
      
   for Pascal-like statements, transform the grammar to be LL(1) and write 
   a recursive descent parser for the transformed grammar in pseudcode.
      
3. (L4.4) Given the grammar

      lexp -> number | "(" op lexp-seq ")"
      op -> "+" | "-" | "*" | "/"
      lexp-seq -> lexp | lexp-seq
      
   for simple arithmetic expressions in Scheme-like prefix notation,
   write a recursive descent parser in pseudocode to compute the value
   of a given expression.

3. (L4.8) Consider the grammar

      lexp -> atom | list
      atom -> number | identifier
      list -> "(" lexp-seq ")"
      lexp-seq -> lexp | lexp-seq lexp
      
   for Scheme expressions.  Ttransform the grammar so that it is LL(1).
   Compute the first and follow sets for the nonterminal symbols of the
   resulting grammar and show that it is indeed LL(1).  Write a recursive
   descent parser for the resulting grammar.

4. Study the Tiny parser parse.c and related files.

5. (L4.22) Rewrite the Tiny grammar so that only Boolean expressions
   are allowed as the test of an if- or repeat-statement, and only
   integer expressions are allowed in write- or assign-statements
   or as operands of any of the operators.
   
6. (L4.23-24) Add Boolean operators "and", "or" and "not" to the Tiny grammar.
   Give them the normal precedences and associativity and lower precedence
   than all arithmetic operators.  Ensure that the conditions of the
   previous question are still satisfied.

Part B.  LALR(1) parsing.

1. Consider the following grammar for balanced parenthesis strings 
   (e denotes the empty sequence):

   S -> ( S ) S | e

   (a) State whether or not the grammar is LL(1) and justify your
	   answer.

   (b) Construct the LR(0) states and the SLR(1) parsing tables 
	   for the grammar, using the method in the lecture notes.  
	   (The resulting states should be identical to those in Figure 5.3
	   and the resulting parsing tables should be equivalent to those 
	   in Table 5.7.)

   (c) State whether or not the grammar is LR(0) and justify your
	   answer.

   (d) Show the (stack) automata transitions when your parsing 
	   tables are used to parse the input string "(())()$".

2. Consider the following grammar for arithmetic expressions 
   (a denotes a number or an identifier):
   
   E -> T | E + T 
   T -> F | T * F 
   F -> a | ( E )

   (a) State whether or not the grammar is LL(1) and justify your
	   answer.

   (b) Construct the LR(0) states and the SLR(1) parsing tables 
	   for the grammar, using the method in the lecture notes.  

   (c) State whether or not the grammar is LR(0) and justify your
	   answer.

   (d) State whether or not the grammar is SLR(1) and justify
	   your answer.

   (e) Show the (stack) automata transitions when your parsing 
	   tables are used to parse the input string "(a+a)*a+a$".

3. Consider the following grammar for C-style expressions, using
   variables and pointer dereferencing (:=, a and * are terminals):

   S -> V := E | E
   E -> V
   V -> a | * E

   (a) Construct the LR(0) states for the grammar, using the method
	   in the lecture nots.

   (b) Explain why the grammar is not an SLR(1) grammar.

   (c) (Later) Use Yacc to construct an LALR(1) parser for the 
   grammar, and hence show it is an LALR(1) grammar.

4. Consider the following grammar (e denotes the empty sequence; 
   (, ), [ and ] are the terminals).

   S -> ( X | E ] | F )
   X -> E ) | F ]
   E -> A
   F -> A
   A -> e

   (a) Explain informally the language defined by this grammar.

   (b) Show that the grammar is LL(1) but not SLR(1).

   (c) This grammar is admittedly contrived.  Can you find a more
	   natural grammar that is LL(1) but not SLR(1) ?