PLI Lecture 8
Run-time environments II: Procedures and parameters 
in stack-oriented languages

Memory organisation, stack frames, environments, procedure call and return
sequences, variable look-up, static and dynamic links, parameter passing 
methods, data representation and access, dynamic memory management, 
allocate and release, garbage collection.  (Phew!)

1. BASIC MEMORY ORGANISATION

(This section extends the Scheme meta-circular evaluators,
by providing more information on how control is managed,
without assuming an underlying Scheme platform.)

Code is stored separately from data.

Data is divided into:
* Global/static area, for global variables
* Stack, for parameters and local variables of procedures
* Heap, for dynamically allocated data

General organisation:
Code - globals - stack - free space - heap

(Stack growing to right)

The parameters and local variables of a single procedure
call are stored in a frame or activation record (on the stack):

Parameter values
Bookkeeping information, including return address
Local variables
Temporary values

(Stack growing downwards)

In Fortran 77 (without recursive procedures or dynamic
storage allocation), all data could be allocated statically:
the global data, plus one fixed frame for each procedure.

In this case, every (local) variable had a fixed address,
known at compile time, computed during semantic analysis.
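
A minimal sketch in C of why fixed addresses rule out
recursion.  The names fact_n and fact_static are invented
for illustration; the parameter is kept in one static cell,
standing in for a one-frame-per-procedure scheme.  (The
temporary sub is an ordinary C local, which a real static
scheme would not have.)

```c
#include <assert.h>

/* The single, fixed-address "frame" of fact_static: one cell for n. */
static int fact_n;

/* Factorial written as if the procedure had exactly one frame. */
int fact_static(int n) {
    assert(n >= 0);
    fact_n = n;                         /* store the parameter in the frame */
    if (fact_n <= 1) return 1;
    int sub = fact_static(fact_n - 1);  /* inner call overwrites fact_n */
    return fact_n * sub;                /* fact_n is no longer this call's n */
}
```

The innermost call leaves fact_n at 1, so fact_static(4)
yields 1 rather than 24.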

(Those were the good ol' days.)

In languages in which all (possibly recursive) procedures 
are global (C, but not Java or Scheme), the bookkeeping 
information of each frame must contain:
* return address: 
    instruction to jump to when this procedure call returns
* control or dynamic link:
    pointer to the frame of the calling procedure

The pointer to the current frame is called the frame pointer
(doh) or fp, often stored in a register (for fast access).

For convenience (see below), fp usually points to the 
bookkeeping section of the frame.

The pointer to the stack top (or bottom) is called the
stack pointer (doh) or sp, also stored in a register.

See Louden, Example 7.2, pp.353-354:

#include <stdio.h>

int x, y;

int gcd(int u, int v) {
  if (v == 0) return u;
  else return gcd (v, u%v);
}

int main(void) {
  scanf("%d%d", &x, &y);
  printf("%d\n", gcd(x,y));
  return 0;
}

Input: 15 10

Figure 7.4 shows the stack state after the calls
main() -> gcd(15,10) -> gcd(10,5) -> gcd(5,0)

See Louden, Example 7.3, pp.354-356 for a more complex 
case involving static local variables (which must be 
stored in the global area).

Now, parameters and local variables are accessed by 
(constant) offsets from the current frame pointer.
(Computer addressing modes make this efficient.)

See Louden, Example 7.4, pp.357-358, for an example.

2. BASIC PROCEDURE CALL AND RETURN SEQUENCES

Procedure call sequence:
1. Evaluate arguments and push onto stack.
2. Store fp as control link of new frame.
3. Set fp to point to new frame.  (E.g., fp = sp;)
4. Set sp to new stack top (push a new frame).
5. Store the return address in the new frame.
6. Jump to start of called procedure.

Procedure return sequence:
1. Set sp to fp. (Pop the current frame.)
2. Load the control link into fp.
3. Jump to the return address.
4. Update sp to pop the arguments.

See Louden, Example 7.5, pp.359-361 for an example.
Consider the stack states during the first, second 
and third calls of function g().
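
The call and return sequences above can be sketched in C
on a simulated stack.  This is an illustration under stated
assumptions, not Louden's code: the stack is an int array
growing towards higher indices, sp indexes the top word, fp
indexes the control link, the return address is a dummy 0,
and C's own recursion and return value stand in for the
jumps and the result register.  All names are hypothetical.

```c
#include <assert.h>

enum { STACK_WORDS = 256 };
static int stack[STACK_WORDS]; /* simulated stack; word 0 is an unused base */
static int sp = 0;             /* index of the top (last used) word */
static int fp = 0;             /* index of the current frame's control link */

static void push(int word) {
    assert(sp + 1 < STACK_WORDS);  /* simulated stack overflow check */
    stack[++sp] = word;
}

/* Call sequence for a two-argument procedure with no locals. */
static void call_enter(int arg1, int arg2, int return_address) {
    push(arg1);             /* 1. push the evaluated arguments */
    push(arg2);
    push(fp);               /* 2. old fp becomes the control link */
    fp = sp;                /* 3. fp now points at the new frame */
    push(return_address);   /* 4-5. grow the frame, store return address */
}

/* Return sequence; yields the address the caller would resume at. */
static int call_leave(int nargs) {
    int return_address = stack[fp + 1];
    sp = fp;                /* 1. pop the current frame */
    fp = stack[sp];         /* 2. reload fp from the control link */
    sp -= nargs + 1;        /* 4. pop the arguments and the link word */
    return return_address;  /* 3. where the jump would go */
}

/* Body of gcd: parameters are read at fixed offsets from fp. */
static int gcd_body(void) {
    int u = stack[fp - 2];
    int v = stack[fp - 1];
    int result = u;
    if (v != 0) {
        call_enter(v, u % v, 0);
        result = gcd_body();
        call_leave(2);
    }
    return result;
}

int run_gcd(int x, int y) {
    call_enter(x, y, 0);
    int result = gcd_body();
    call_leave(2);
    return result;
}
```

With input 15 10, run_gcd builds three frames, for
gcd(15,10), gcd(10,5) and gcd(5,0), then pops them again.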

These operations must be divided between the calling
and called procedures.

Calling procedure (caller):
On entry:
1. Evaluate arguments and push onto stack.
2. Store fp as control link of new frame.
3. Store the return address in the new frame.
4. Jump to entry of called procedure.

On exit:
1. Use returned value.

Called procedure (callee):
On entry:
1. Set fp to point to new frame.
2. Set sp to new stack top (push new frame).

Execute instructions of procedure.

On exit:
1. Store returned value.
2. Set fp from control link to old value.
3. Set sp to old stack top (pop current frame).
4. Jump (indirectly) to return address.

Details vary depending on exact frame representation
and on instruction set.

E.g., the returned value is often stored in a register.

When setting sp to push a new frame, the callee must leave
space for its local variables.

Note that both parameters and local variables may be of
arbitrary size (e.g., structs, arrays), and that the size
may vary from call to call (e.g., if local array size
depends on a parameter).  (This latter feature requires
special techniques.)

3. BASIC PARAMETER AND LOCAL VARIABLE ACCESS

Parameters and local variables are normally accessed as
constant offsets from the frame pointer.  Parameters and
local variables are normally on opposite sides of the 
frame pointer.  These offsets are computable, during
semantic analysis, at compile time.
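
A small sketch of how such offsets might be assigned.  The
layout is an assumption made for illustration: every item
is one word, arguments are pushed left to right, fp points
at the control link, and the return address sits at fp+1.
The function names are hypothetical.

```c
#include <assert.h>

/* Assumed bookkeeping: control link at fp+0, return address at fp+1. */
enum { BOOKKEEPING_WORDS = 2 };

/* Offset from fp of parameter number index (0-based) out of nparams.
   Parameters lie below fp, so their offsets are negative. */
int param_offset(int index, int nparams) {
    assert(0 <= index && index < nparams);
    return index - nparams;
}

/* Offset from fp of a local preceded by words_before words of locals.
   Locals lie above the bookkeeping words, so offsets are positive. */
int local_offset(int words_before) {
    assert(words_before >= 0);
    return BOOKKEEPING_WORDS + words_before;
}
```

For gcd(u, v) this puts u at fp-2 and v at fp-1; a first
local variable, if there were one, would be at fp+2.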

In C, we do NOT use a separate stack frame for every
nested block, i.e., compound statement.  We treat all
local variables of nested blocks as variables of the
current function.

Hence, in C, all (real) nonlocal variables are global.
Global variables are accessed as a constant offset from
a (new) global pointer.

Procedural parameters can simply be passed as pointers
(to their first instruction).
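
For instance, in C a procedural parameter is an ordinary
function pointer.  The names apply_twice and add_three are
invented for this sketch:

```c
#include <assert.h>
#include <stddef.h>

int add_three(int n) { return n + 3; }

/* f is a procedural parameter: only the code address is passed. */
int apply_twice(int (*f)(int), int x) {
    assert(f != NULL);
    return f(f(x));
}
```

The call apply_twice(add_three, 4) passes the address of
add_three; no environment is needed, because add_three can
only refer to its own locals and to globals.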

4. NESTED PROCEDURES

Many languages (e.g., Scheme, Haskell, Pascal, Ada) 
allow nested procedures and use static scope rules.
In such languages, each procedure is declared at a 
particular nesting level, starting from 0.

See Louden, Fig. 7.8 (p.366):

program main;

procedure p;
  var n: integer;

  procedure q;
  begin (* nonlocal ref. to n *) end;

  procedure r(n: integer);
  begin q; end;

begin n := 1; r(2); end; (* p *)

begin (* main *) p; end.

With static scope rules, any reference to n inside q
refers to the n declared in p, not the parameter n of r.
We have already studied similar examples in Scheme.

This requires an additional link called the static or
access link in the bookkeeping part of each stack frame.
This link points to the most recent frame for the textually
enclosing procedure of the called procedure.

See Louden, Fig. 7.10 (p.367).

Nonlocals may be declared in procedures (more generally
blocks), more than one level outside the procedure 
containing their reference.

See Louden, Figs 7.11 and 7.12 (pp.368-369).

Now, to access a nonlocal variable x declared in 
a procedure P at nesting level i from a procedure Q 
at nesting level j (i <= j), follow j-i static links 
to the frame for procedure P (which contains the value 
of x), then access x at a constant offset from the resulting 
pointer.  (Note the special case i = j.)

Nesting levels can be computed during semantic analysis,
and stored (in the symbol table, or structure tree) with
the variable.  So the number of static links to follow
is known, and is usually small in practice.
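
A sketch of this lookup in C.  The struct layout and names
are assumptions for illustration, as is the level
numbering: main at level 0, p at level 1, q and r at
level 2.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_SLOTS = 4 };

struct frame {
    struct frame *static_link;   /* frame of textually enclosing procedure */
    struct frame *control_link;  /* frame of the caller */
    int slots[MAX_SLOTS];        /* parameters and locals */
};

/* Follow (use_level - decl_level) static links from the current frame. */
struct frame *frame_at_level(struct frame *current, int use_level,
                             int decl_level) {
    assert(decl_level <= use_level);
    for (int hops = use_level - decl_level; hops > 0; hops--) {
        assert(current != NULL);
        current = current->static_link;
    }
    return current;
}

/* Stack state of the Pascal example while q runs: p -> r -> q. */
int q_reads_n(void) {
    struct frame p = { NULL, NULL, { 1 } };  /* local n = 1 */
    struct frame r = { &p, &p, { 2 } };      /* parameter n = 2 */
    struct frame q = { &p, &r, { 0 } };      /* called by r, nested in p */
    /* q is at level 2 and n is declared in p at level 1: one hop. */
    return frame_at_level(&q, 2, 1)->slots[0];
}
```

Here q_reads_n returns 1, the n of p.  Following the
control link instead would reach r's frame and find 2.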

The procedure call and return sequences must be modified.

On procedure calls, the static link must be pushed onto
the stack just before the dynamic link fp.  On procedure
return, the stack pointer must be decremented by an
additional amount to remove the static link as well as
the parameters.

Suppose a procedure at level k calls a procedure of level i,
where i <= k+1.  Then we follow k-i+1 static links, starting
from the caller's frame, to find the new static link (to
store in the new frame).
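
A sketch of this rule in C, with a hypothetical frame
struct and the assumed numbering main = 0, p = 1,
q = r = 2 for the Pascal example:

```c
#include <assert.h>
#include <stddef.h>

struct frame {
    struct frame *static_link;   /* frame of textually enclosing procedure */
    struct frame *control_link;  /* frame of the caller */
};

/* Static link for a callee declared at callee_level, called from a
   procedure at caller_level whose frame is caller. */
struct frame *callee_static_link(struct frame *caller, int caller_level,
                                 int callee_level) {
    assert(callee_level <= caller_level + 1);
    struct frame *link = caller;
    for (int hops = caller_level - callee_level + 1; hops > 0; hops--) {
        assert(link != NULL);
        link = link->static_link;
    }
    return link;
}

/* Returns 1 if the links of the Pascal example come out as expected. */
int example_links_ok(void) {
    struct frame p = { NULL, NULL };
    /* p (level 1) calls r (level 2): 0 hops, the link is p's own frame. */
    struct frame r = { callee_static_link(&p, 1, 2), &p };
    /* r (level 2) calls q (level 2): 1 hop, the link is r's static link. */
    struct frame q = { callee_static_link(&r, 2, 2), &r };
    return r.static_link == &p && q.static_link == &p;
}
```

When i = k+1 the callee is nested directly inside the
caller, so no links are followed and the caller's own
frame becomes the static link.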

See Louden, Fig. 7.13 (p. 370).

Procedural parameters must now be passed as closures,
consisting of an instruction pointer ip, and an environment
pointer ep (cf. the final Scheme interpreter).

See Louden, Example 7.7 (pp. 371-373).  Consider the stack
state after the second call to r in Fig. 7.11.

Procedural parameters in languages with nested scope rules
can be implemented using a stack-based environment.
Procedural values in such languages require heap-based
storage for environments (which is the main reason they
are not provided in many languages such as C and Java).

5. PARAMETER PASSING METHODS

(Louden, Section 7.5)

It's important to distinguish between parameters
passed:
* by value (pass value of argument)
* by reference (pass address of argument, and access the
  argument through it during procedure call)
* by result (pass address of argument, use local variable
  during procedure call, assign final value of local 
  variable to argument)
* by value-result (pass address of argument, assign
  value of argument to local variable, use local
  variable during procedure call, assign final value
  of local variable to argument)
* by name (now rare, thankfully, so omitted)
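
C itself only passes by value, but the other methods can
be imitated with pointers.  The sketch below (all names
invented) passes the same variable for both parameters,
which is where by reference and by value-result differ.
By result is the copy-out half of by value-result and is
not shown.

```c
#include <assert.h>
#include <stddef.h>

/* By value: the callee gets copies; the argument is unchanged. */
void bump_by_value(int a, int b) {
    a += 1;
    b += 1;
    (void)a; (void)b;
}

/* By reference: the callee works through the argument's address. */
void bump_by_reference(int *a, int *b) {
    assert(a != NULL && b != NULL);
    *a += 1;
    *b += 1;
}

/* By value-result: copy in, work on locals, copy out at the end. */
void bump_by_value_result(int *a, int *b) {
    assert(a != NULL && b != NULL);
    int local_a = *a, local_b = *b;  /* copy in */
    local_a += 1;
    local_b += 1;
    *a = local_a;                    /* copy out */
    *b = local_b;
}

/* Pass x for both parameters and report its final value. */
int aliased_by_value(void) {
    int x = 0; bump_by_value(x, x); return x;
}
int aliased_by_reference(void) {
    int x = 0; bump_by_reference(&x, &x); return x;
}
int aliased_by_value_result(void) {
    int x = 0; bump_by_value_result(&x, &x); return x;
}
```

Starting from x = 0, the three calls leave x at 0, 2
and 1 respectively.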

Parameters may sometimes be procedures (but not in
Java).  In C, it suffices to pass a pointer (address)
of a function.  In languages with nested procedures
(e.g., Scheme, Pascal), it's more interesting.

Consider Louden, Example 7.7.

program closure(output);
  procedure p(procedure a);
  begin a; end;
  
  procedure q;
    var x: integer;
    procedure r;
    begin writeln(x); end;
    
  begin x := 2; p(r); end;
  
begin q; end.

Note that when parameter a is called, its value is 
a "closure" consisting of the procedure entry
for the argument r and the environment of r.  I.e.,
when we pass argument r, we pass a pair containing
the entry point of r and the access link to procedure q.  
When we call the procedural parameter a, we set the 
access link in the new activation record to the access 
link in the closure, and we jump to the entry point 
in the closure.
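
A sketch of such a closure in C, modelled on the Pascal
program above.  The struct names are hypothetical, and r
returns x instead of calling writeln so that the result
can be inspected.

```c
#include <assert.h>
#include <stddef.h>

/* A closure: an entry point ip plus an environment pointer ep. */
struct closure {
    int (*ip)(void *ep);
    void *ep;
};

/* The part of q's frame that r needs: the local variable x. */
struct q_frame { int x; };

/* r reaches the nonlocal x through the access link it is given. */
static int r(void *access_link) {
    struct q_frame *enclosing = access_link;
    return enclosing->x;
}

/* p calls its procedural parameter a: jump to ip, passing ep along. */
static int p(struct closure a) {
    assert(a.ip != NULL);
    return a.ip(a.ep);
}

/* q sets x := 2 and passes r together with a link to q's own frame. */
int q(void) {
    struct q_frame frame;
    frame.x = 2;
    struct closure arg = { r, &frame };
    return p(arg);
}
```

The closure is safe here because q's frame is still on the
stack while p runs; returning the closure from q would
leave ep dangling.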

Procedural values in languages with nested scope rules
require heap-based storage for environments, or require
copying the environment to create a closure (which is 
the main reason they are not provided in many languages 
such as C and Java).

Procedure values (e.g., in Scheme, Haskell) add another
level of complexity.

6. DYNAMIC STORAGE MANAGEMENT

(Louden, Section 7.4)

* Heap management
* Garbage collection