PLI Lecture 1
Introduction to PLI and to 3516ICT
* Why study PLI?
- Frequent need to implement small PLs
- (Occasional need to implement large PLs)
- Helps understand how to design PLs (and computers)
- Helps understand how to use PLs (and PLIs)
- Excellent case study in SE
* PLI history:
- First (Fortran) compiler, ca. 1955
- First Lisp interpreter, ca. 1960
- First C compiler, ca. 1970
- First popular parser generator, yacc, ca. 1975
- First popular pipelined target machines, ca. 1990
- First IDEs, Borland, VB, VC++, ca. 1990-1995
- First complete compiler generators, Eli, ca. 2000
- Recent progress is most active in code generation
for increasingly complex target machines
* How are PLs implemented?
- Interpretation: "Direct" execution of source language
(I diagrams)
- Compilation: Translation of source language to
target languages (T diagrams)
* Source languages
- Compilation techniques are very different
for procedural, OO, functional and logic
languages
- We shall focus on procedural and OO languages
* Target languages
- Real: x86, PowerPC, SPARC, Cell, ...
- Virtual: C, JVM, .NET CLR, MMIX, TM, ...
* Traditional Lisp (Scheme) metacircular interpreters
- Proposed ca. 1958 by John McCarthy at MIT
- Current realisations: Common Lisp, Scheme
- See H. Abelson and G.J. Sussman, Structure and Interpretation
of Computer Programs, Second Edition, MIT Press, 1996.
- See Section 4.1: The Metacircular Evaluator
- Example Scheme function definitions
- Metacircular evaluator structure
* Generic compiler architecture
- Preprocessing
- Analysis
. Lexical analysis (scanning)
. Syntactic analysis (parsing)
. Semantic analysis
+ Name identification
+ Type identification
- Synthesis
. Semantic analysis (target attribution)
. Code generation (intermediate code, target code)
. Optimisation
- Module interfaces are well defined
- Modules are _not_ executed sequentially
. E.g., lexical analysis and parsing are coroutines,
optimisation is done before _and_ after code generation
- See control flow
* Intermediate representations (much variation)
- Source program
. E.g., "a[index] = 14+2 // an assignment"
- Token sequence
. E.g., "a" "[" "index" "]" "=" "14" "+" "2"
- Parse trees
. See tree 1 and tree 2
- Structure trees
. See tree 3
- Intermediate code program
. See later lecture
- Assembly language program
. MOV R0, index   ; R0 <- value of index
. SHL R0          ; scale by word size (2)
. MOV &a(R0), 16  ; a[index] <- 14+2 = 16
- Machine language program
- (Finally, linking and loading is required.)
* Global data structures
- Literal (constant) table
- Symbol (name) table
* Other issues:
. Error reporting and recovery
. Software (product) quality issues
. Optimisation:
- Speed of compilation vs speed of generated code
- Space requirements
* Running example
- The Tiny language
. Simple procedural language
. No floats, arrays, procedures
. Example:
{ Sample Tiny program - computes factorial }
read x; { input an integer x }
if 0 < x then { Tiny's only comparisons are < and = }
  fact := 1;
  repeat
    fact := fact * x;
    x := x - 1
  until x = 0;
  write fact { output factorial of x }
end
- The Tiny compiler
. Implemented in C - review C!
. Files:
globals.h main.c
util.h util.c
scan.h scan.c
parse.h parse.c
symtab.h symtab.c
analyze.h analyze.c
code.h code.c
cgen.h cgen.c
. main.c:
syntaxTree = parse();
buildSymtab(syntaxTree);
typeCheck(syntaxTree);
codeGen(syntaxTree, codefile);
- The TM machine (sic)
. Very simple register machine
. Simulator available
. Important exercise:
Hand compile the above Tiny program into an
equivalent TM assembly language program.
- Specifications of Tiny and TM are available
on the Web site.
* Second running example
- The Scheme language
. Modern dialect of Lisp, a simple, expressive programming language
that encourages a functional programming style.
. Available on dwarf (guile) and PCs (PLT Scheme).
. Learn Scheme!
- A Scheme interpreter in Scheme
. Avoids scanning, parsing, and some data-representation issues
. Focuses on implementation of control structures
* About 3516ICT