PLI Lecture 8

Run-time environments II: Procedures and parameters in stack-oriented languages

Memory organisation, stack frames, environments, procedure call and return sequences, variable look-up, static and dynamic links, parameter passing methods, data representation and access, dynamic memory management, allocate and release, garbage collection. (Phew!)

1. BASIC MEMORY ORGANISATION

(This section extends the Scheme meta-circular evaluators, by providing more information on how control is managed, without assuming an underlying Scheme platform.)

Code is stored separately from data. Data is divided into:

* Global/static area, for global variables
* Stack, for parameters and local variables of procedures
* Heap, for dynamically allocated data

General organisation:

    Code - globals - stack - free space - heap
    (Stack growing to right)

The parameters and local variables of a single procedure call are stored in a frame or activation record (on the stack):

    Parameter values
    Bookkeeping information, including return address
    Local variables
    Temporary values
    (Stack growing downwards)

In Fortran 77 (without recursive procedures, or dynamic storage allocation), all data could be stored statically: the global data, plus one frame for each procedure. In this case, every (local) variable had a fixed address, known at compile time, computed during semantic analysis. (Those were the good ol' days.)

In languages in which all (possibly recursive) procedures are global (C, but not Java or Scheme), the bookkeeping information of each frame must contain:

* return address: the instruction to go to after this procedure call
* control or dynamic link: pointer to the frame of the calling procedure

The pointer to the current frame is called the frame pointer (doh) or fp, often stored in a register (for fast access). For convenience (see below), fp usually points to the bookkeeping section of the frame.
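
The frame layout described above can be made concrete with a small simulation. What follows is only a minimal sketch in C, not the layout used by any particular compiler. The array `stack`, the slot offsets, the helpers `push_frame`, `pop_frame`, `frame_depth` and `print_frame`, and the use of a label number as the "return address" are all assumptions made for illustration. In this sketch the stack grows towards higher indices and fp points at the control-link slot, so parameters sit on one side of fp and local variables on the other.

```c
#include <assert.h>
#include <stdio.h>

#define STACK_WORDS 64
#define NO_FRAME (-1)

/* Simulated stack memory (hypothetical; grows towards higher indices). */
static int stack[STACK_WORDS];
static int sp = 0;          /* index of the first free word        */
static int fp = NO_FRAME;   /* index of the current control link   */

/* Slot offsets relative to fp: an assumed layout for this sketch only. */
enum {
    PARAM_U      = -2,  /* first parameter                         */
    PARAM_V      = -1,  /* second parameter                        */
    CONTROL_LINK =  0,  /* fp of the calling procedure             */
    RETURN_ADDR  =  1,  /* dummy label number, not a real address  */
    LOCAL_0      =  2   /* one local variable                      */
};

/* Build a frame: parameters, then bookkeeping, then space for a local. */
void push_frame(int u, int v, int return_label)
{
    assert(sp + 5 <= STACK_WORDS);
    stack[sp++] = u;
    stack[sp++] = v;
    stack[sp] = fp;              /* store old fp as the control link */
    fp = sp++;                   /* fp now points at the bookkeeping */
    stack[sp++] = return_label;
    stack[sp++] = 0;             /* local variable, initially 0      */
}

/* Discard the current frame (including its parameters) and restore fp. */
void pop_frame(void)
{
    assert(fp != NO_FRAME);
    sp = fp + PARAM_U;
    fp = stack[fp + CONTROL_LINK];
}

/* Count the live frames by following control links back to the bottom. */
int frame_depth(void)
{
    int depth = 0;
    for (int f = fp; f != NO_FRAME; f = stack[f + CONTROL_LINK])
        depth++;
    return depth;
}

/* Show the current frame, reading every slot at a constant offset from fp. */
void print_frame(void)
{
    assert(fp != NO_FRAME);
    printf("u=%d v=%d control=%d return=%d local=%d\n",
           stack[fp + PARAM_U], stack[fp + PARAM_V],
           stack[fp + CONTROL_LINK], stack[fp + RETURN_ADDR],
           stack[fp + LOCAL_0]);
}
```

Pushing three frames with arguments (15,10), (10,5) and (5,0) and then calling `frame_depth()` walks three control links; each `pop_frame()` restores the caller's fp from the control link stored in the frame being discarded.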
The pointer to the stack top (or bottom) is called the stack pointer (doh) or sp, also stored in a register.

See Louden, Example 7.2, pp.353-354:

    #include <stdio.h>

    int x, y;

    int gcd(int u, int v)
    {
        if (v == 0) return u;
        else return gcd(v, u % v);
    }

    int main(void)
    {
        scanf("%d%d", &x, &y);
        printf("%d\n", gcd(x, y));
        return 0;
    }

Input: 15 10

Figure 7.4 shows the stack state after the calls

    main() -> gcd(15,10) -> gcd(10,5) -> gcd(5,0)

See Louden, Example 7.3, pp.354-356, for a more complex case involving static local variables (which must be stored in the global area).

Now, parameters and local variables are accessed by (constant) offsets from the current frame pointer. (Computer addressing modes make this efficient.) See Louden, Example 7.4, pp.357-358, for an example.

2. BASIC PROCEDURE CALL AND RETURN SEQUENCES

Procedure call sequence:

1. Evaluate arguments and push onto stack.
2. Store fp as control link of new frame.
3. Set fp to point to new frame. (E.g., fp = sp;)
4. Set sp to new stack top (push a new frame).
5. Store the return address in the new frame.
6. Jump to start of called procedure.

Procedure return sequence:

1. Set sp to fp. (Pop the current frame.)
2. Load the control link into fp.
3. Jump to the return address.
4. Update sp to pop the arguments.

See Louden, Example 7.5, pp.359-361, for an example. Consider the stack states during the first, second and third calls of function g().

These operations must be divided between the calling and called procedures.

Calling procedure (caller):

On entry:
1. Evaluate arguments and push onto stack.
2. Store fp as control link of new frame.
3. Store the return address in the new frame.
4. Jump to entry of called procedure.

On exit:
1. Use returned value.

Called procedure (callee):

On entry:
1. Set fp to point to new frame.
2. Set sp to new stack top (push new frame).

Execute instructions of procedure.

On exit:
1. Store returned value.
2. Set fp from control link to old value.
3. Set sp to old stack top (pop current frame).
4. Jump (indirectly) to return address.

Details vary depending on exact frame representation and on instruction set. E.g., the returned value is often stored in a register.

When setting sp on pushing a new frame, we must leave space for local variables. Note that both parameters and local variables may be of arbitrary size (e.g., structs, arrays), and that the size may vary from call to call (e.g., if a local array's size depends on a parameter). (This latter feature requires special techniques.)

3. BASIC PARAMETER AND LOCAL VARIABLE ACCESS

Parameters and local variables are normally accessed at constant offsets from the frame pointer. Parameters and local variables are normally on opposite sides of the frame pointer. These offsets are computable at compile time, during semantic analysis.

In C, we do NOT use a separate stack frame for every nested block, i.e., compound statement. We treat all local variables of nested blocks as variables of the current function.

Hence, in C, all (real) nonlocal variables are global. Global variables are accessed at a constant offset from a (new) global pointer.

Procedural parameters can simply be passed as pointers (to their first instruction).

4. NESTED PROCEDURES

Many languages (e.g., Scheme, Haskell, Pascal, Ada) allow nested procedures and use static scope rules. In such languages, each procedure is declared at a particular nesting level, starting from 0.

See Louden, Fig. 7.8 (p.366):

    program main;

    procedure p;
    var n: integer;

      procedure q;
      begin
        (* nonlocal ref. to n *)
      end;

      procedure r(n: integer);
      begin
        q;
      end;

    begin
      n := 1;
      r(2);
    end; (* p *)

    begin (* main *)
      p;
    end.

With static scope rules, any reference to n inside q refers to the n declared in p, not the parameter n of r. We have already studied similar examples in Scheme.

This requires an additional link called the static or access link in the bookkeeping part of each stack frame.
This link points to the most recent frame for the textually enclosing procedure of the called procedure. See Louden, Fig. 7.10 (p.367).

Nonlocals may be declared in procedures (more generally, blocks) more than one level outside the procedure containing their reference. See Louden, Figs 7.11 and 7.12 (pp.368-369).

Now, to access a nonlocal variable x declared in a procedure P at nesting level i from a procedure Q at nesting level j (i <= j), follow j-i static links to the frame for procedure P (which contains the value of x), then access x at a constant offset from the resulting pointer. (Note the special case i = j.)

Nesting levels can be computed during semantic analysis, and stored (in the symbol table, or structure tree) with the variable. So the number of static links to follow is known at compile time, and is usually small in practice.

The procedure call and return sequences must be modified. On procedure calls, the static link must be pushed onto the stack just before the dynamic link fp. On procedure return, the stack pointer must be adjusted by an additional amount to remove the static link as well as the parameters.

Suppose a procedure at level k calls a procedure at level i, where i <= k+1. Then we follow k-i+1 static links to find the new static link (to store in the new frame). See Louden, Fig. 7.13 (p.370).

Procedural parameters must now be passed as closures, consisting of an instruction pointer ip and an environment pointer ep (cf. the final Scheme interpreter). See Louden, Example 7.7 (pp.371-373). Consider the stack state after the second call to r in Fig. 7.11.

Procedural parameters in languages with nested scope rules can be implemented using a stack-based environment. Procedural values in such languages require heap-based storage for environments (which is the main reason they are not provided in many languages such as C and Java).

5. PARAMETER PASSING METHODS (Louden, Section 7.5)

It's important to distinguish between parameters passed:

* by value (pass value of argument)
* by reference (pass address of argument, and use during procedure call)
* by result (pass address of argument, use local variable during procedure call, assign final value of local variable to argument)
* by value-result (pass address of argument, assign value of argument to local variable, use local variable during procedure call, assign final value of local variable to argument)
* by name (now rare, thankfully, so omitted)

Parameters may sometimes be procedures (but not in Java). In C, it suffices to pass a pointer to (the address of) a function. In languages with nested procedures (e.g., Scheme, Pascal), it's more interesting.

Consider Louden, Example 7.7:

    program closure(output);

    procedure p(procedure a);
    begin
      a;
    end;

    procedure q;
    var x: integer;

      procedure r;
      begin
        writeln(x);
      end;

    begin
      x := 2;
      p(r);
    end;

    begin
      q;
    end.

Note that when parameter a is called, its value is a "closure" consisting of the procedure entry for the argument r and the environment of r. I.e., when we pass argument r, we pass a pair containing the entry point of r and the access link to procedure q. When we call the procedural parameter a, we set the access link in the new activation record to the access link in the closure, and we jump to the entry point in the closure.

Procedural values in languages with nested scope rules require heap-based storage for environments, or require copying the environment to create a closure (which is the main reason they are not provided in many languages such as C and Java).

Procedure values (e.g., in Scheme, Haskell) add another level of complexity.

6. DYNAMIC STORAGE MANAGEMENT (Louden, Section 7.4)

* Heap management
* Garbage collection
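
Returning to the procedural parameters of Section 5: the closure passed for r in Louden's Example 7.7 can be imitated in C by pairing a function pointer with an explicit environment pointer. This is only a rough sketch under stated assumptions, not how a Pascal compiler actually lays out frames. The names `struct q_frame` and `struct closure` are hypothetical, the fields `ip` and `ep` follow the terminology of Section 4, and r returns x as well as printing it (the original only calls writeln) so that the result can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the activation record of q: holds its local x. */
struct q_frame {
    int x;
};

/* A closure: instruction pointer (ip) plus environment pointer (ep). */
struct closure {
    int (*ip)(void *ep);
    void *ep;
};

/* r is nested inside q in the Pascal source. Here its access link is
   passed explicitly as ep, and the nonlocal x is reached through it. */
int r(void *ep)
{
    struct q_frame *env = ep;   /* follow the access link to q's frame */
    printf("%d\n", env->x);     /* corresponds to writeln(x)           */
    return env->x;
}

/* p calls its procedural parameter a: install the closure's environment
   as the access link of the new call, then jump to the closure's entry. */
int p(struct closure a)
{
    assert(a.ip != NULL);
    return a.ip(a.ep);
}

/* q sets x := 2 and passes r, together with q's own frame, to p. */
int q(void)
{
    struct q_frame frame;
    frame.x = 2;
    struct closure a = { r, &frame };
    return p(a);
}
```

Because the environment here is q's own stack frame, the closure is only valid while q is still active. That is adequate for procedural parameters, but a procedure returned as a value would outlive the frame, which is the situation that calls for heap-based environments.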