-------------------------------------------------------------------------------
Three basic ways to traverse a Tree Structure

     Depth                Top-Down                Breath
        7                    1                       1
      /   \                /   \                   /   \
     3     6              2     5                 2     3
    / \   / \            / \   / \               / \   / \
   1   2 4   5          3   4 6   7             4   5 6   7

  Recurse each         Do current node     For each node on a list
     child             Recurse on each       Do Node
  Do current node         child node         Append all children to
                                                end of queued list

   (Recursive)           (Recursive)             (Iteritive)

Also see  "info/perl/file_traversal_notes"
for Directory structures.

===============================================================================
                      Polymorphic Data Structures
-------------------------------------------------------------------------------
Glossary:
    Data structure:  The complex data storage for multiple types of elements.
    structure:       short for data structure -- not a C structure.
    C structure:     C `stuct' or `union' data type.
    Element:         The user data that we want to store.
    Declaration:     Declaration of the data structure
    Instance:        Definition of one data stucture for actual use.
    Polymorphic DS   A Data Structure which could be used with any data type.

-------------------------------------------------------------------------------
Problems and techniques.

Problems associated with this

 1/ Storing variable types of elements (few languages provides a generic type)
    Insertion:
      How should elements be inserted into the data stucture?
            As a pointer to elements only?
            By assignment?
            By memory copying?
      Similarly what data types can be inserted into the structure?
            Integer types only (char,short,int, long)?
            Pointers only  (void *)?
            Basic C data types (int, float, char *)?
            Arrays (char [80])?
            Actual C Structures?
            Variable sized C Structures?

      For example, with a basic type such as an integer you would expect
      the data structure to just make a copy of it and not use pointers.
      Similarly with a pointer. But what if an actual C structure is given
      and not a pointer to a structure? Should it be copied, or just
      disallowed. The last is very dependant on the compiler available.

        What if the data type is highly variable in length and the user
      only wants to allocate just enough memory required for each of his
      data elements to be stored.  In the X window system, for example,
      has a number of such element. For instance `XEvent' and `Widget',
      are pointers to such variable C structures.

    Lookup:
        How does the data structure return a copy of the element stored in
      the data structure. This  Does the user need to free the element
      returned or would this cause it to free the copy of the element
      still in the structure as well.

    Destruction/Removal:
        The problem with element type destruction is that only the user
      know how to destroy the element data he provides. Many data
      structure inplimentations for pointers assume that to destroy the
      item stored you only need to free the structure pointed to. WRONG!

      Some of the problems with this are:
           Pointers associated information.
             EG: to a linked list of lines for a polygon data object.
           Pointers to related but separate structures.
             EG: a pointer to the data structure the element is stored in.
           The element may also be referenced elsewhere.
             EG: It may be in another different data structure.

        In short only the user knows how to (or if) to destory the data
      stored in the data structure. How does the data structure do this?
      The best and only good answer for this, is to follow a good programming
      rule of thumb.  I have seen many programs

           ``Whoever allocated/created it must itself free it''

      If the above rule is to be broken, it must be verbosely written
      as being so in manuals, proceedural comments and general comments.

      IE: if the user allocated the stored data, it must be given back to
      him/she to be freed. I have seen many routines (not just data
      structures) which allocate memory, but either never free it or tell
      the user how to free it. For example: many manuals mention the
      library `strdup' function to duplicate a string but do not mention
      that you must call `free' when you are finish using it. A good
      memory debugging library usually finds these problems.

        This brings us to the main problem, that is usably ignored by most
      implimentations I have seen. Destroying a instance of the data
      structure completely when finished. In this case you can usually use
      a technique to very quickly retrieve elements without having to
      ensure the data structure remains in correct usable order.

      The only good techiques I have found are.

        1/ Destroy instance call with a destory element proceedure as an
               argument for the user to supply.

        2/ Provide a trio of functions (macros) to destory the structure
               destory_start(ds);
               while( (data = destroy_element(ds)) != NullValue ) {
                 /* destroy user data here */
               }
               destroy_end(ds);

    Key Search & Comparisions:
        Though not as critical as the other operations above, it is still
      important. In designing a data structure with macros, all the other
      operations above can actually be performed inside the macros. Thus
      solving any typing problem that may exist. Comparisions however are
      part of the search loop of the structure and thus can't be placed in
      a macro.

        How should the users comparision function be called. What should
      it supply, pointers to the stored data like the qsort() library
      function, or more logically a copy of the data stored so it couldn't
      possibably be changed.  Can the comparasion function be different
      for different instances?  Does the comparision function need to be
      provided on every operation or only when an instance of a data
      structure is declared?


 2/ Does the user need to provide any special provisions to the element
    he wants to store in the stucture?

      For example, does he need to add a `next' field to his C structure
    for a linked list? Can he specify the a field name for this use,
    so the data can be stored in multiple structures of the same type.


 3/ Polymophism of the data structure (how contrived is it?)

    Can you provide multiple and separate instances of the data structure?
    Can each instance use a different element types from each other?
    Does a different data type require the procedures to be duplicated?

      * One element type only, one instance only, per program.

        The acutal data type is usually defined in the structures header
        file at a specific location.

        This is the the most common type of data stuturure and the
        simplest to both use and write. In fact it is just a typlical
        direct implementation of a specific data type with the type being
        modified by a defined macro.

      * One element type per coded module of the data structure.

          This is generally achieved by compiling the data structures
        module twice with a different set of defines, to produce a
        different set of procedures for each element type required.

          The only additional complexity required for the encoding over
        that previously is to use the macros to re-label the procedures
        and structure definitions using either the type being stored or
        more commonly a string given my the user.

          This method can be carried even further by encoding all the
        proceedure and data structure definitions into a single marco
        definition. That macro definition is then inserted by the
        programmer into is code as required. The macro call is supplied it
        with the names (or part of the names) of the data structures,
        procedures, and element type to be used.

          The best example of this I have seen is a ``general linked list
        sorter'' proceedure. The programmer just includes a header at the
        top of his code and then adds a single macro call, suppling: the
        name of the produre to create; the data type of the elements in
        the users linked list; and the name of the field in the element
        structure which pointes to the next element. The macro then
        expands into two recursive procedures which does a merge sort on
        any linked link of the type given. The comparision function is
        given on the call to the actual sorting proceedure created.

          Encoding all the procedures needed in a single macro is
        probably extreme, and unwarrented, but it is not too difficult
        to do and results in only a single include header file.


      * One code that handles multiple instances of Multiple element types.

          This is a true polymorphic data structure, and very difficult to
        achieve without some sort of restriction.

          The most common restriction is to limit the element type to
        either a pointer, or at least interger sized. This is easy to do
        and both of the previous types can be converted to this form by
        using an element type of ``void *''.  This is a cop out.

          I have yet to see a truly polymorphic data structure in C,
        though below I have outlined a method which may produce such a
        structure.


 5/ How is insertion/retrival for keyed elements defined.

      Does the key need to be part of the stucture being inserted or is
    is a separate element?  What restriction is their of the type of a key
    element? Can the key type be itself a structure?  How is the key
    tested, and is the test for equality or relative?  Can you insert the
    same key multiple times?  If so in what order are they retrived  FIFO,
    FILO, UNDEFINED.


 6/ Can Data Stucture instance variable change value for the same structure?

      For example, in a typical stack implimentation, a stack instance is
    just a pointer to the top element on the stack. This would vary as
    elements are pushed on and poped off the stack. This could be a
    disaster if the user wants to reference the stack in other places such
    as another data structure, or pass the stack to other procedures.  A
    better idea is to return a pointer to a stack header instead of the
    top of stack.


 7/ How to handle error conditions?

      Return a pre-declared or user supplied `Null' type value or other
    error value?  Or just NULL for pointers?  Call a error handler?  Can
    the user define the error handler?  Can you continue after an error?
    Can the above items be different for different instances?


-------------------------------------------------------------------------------
Methods of Implementation

 1/ Restricated to pointers only (void *), with multiple instances

      This is simple and satisifys most of the problems above.  It also
    allows NULL to be used for error returns and works well in just about
    all possible cases. But it is not truly polymorphic and thus is a
    cop out.

    Problems:
      No type checking of data elements of the structure.
      How to access any keys stored within the elements.
      Comparisions are pointers to the pointer or the pointer itself?


 2/ Use defines to set the element type, and other required information
    to compile into generic procedures.

    Problems:
       Only one type of any data element can be defined in a program,
          unless the procedures can be re-labeled by another define.


 3/ Use Macros to write procedures and declarations for the
    specific element type wanted.

      The ``linked list merge sorting'' program mentioned above was of
    this type. The procedures names are modified to include a name
    provided by the macro caller, thus insuring the ability to create
    a set of procedures to handle any element type required.

      Although multiple element types and instances are catered for this
    method is highly wasteful of code generation, with multple copied of
    procedures with very little differances. This method however is
    perfect for cases where only one data type needs to be stored.

    Problem:
      Duplicate code for different element types.


 4/ Macros to make data structure declarations for a specific data type,
    but does not use any type defines to modify procedures to a specific
    type. The procedures themselves handle all instances of the structure.

    Example for a stack type:
        #include "stack.h"                 /* include this file */
        STKdecl(FooStack, Foo );
               /* Declare a stack of type FooStack for elements of type Foo */
        STKinit(stack, FooStack, NullValue);
               /* Create and Initialize `stack' as an instance of stack type
                   FooStack, declare any empty/error returns as NullValue */

      The major problem is how to copy the elements (char, int, pointer)
    into the data structure without knowing the actual type at that time.

      The solution for insertsion and retrival is to call the procedures
    to only find (and create?) the structure, but have macro calls do the
    handling of the users data into and out of the data structure.  The
    macros being expanded in the users code area, and will know and
    understand the type of the users data and so typechecking is performed
    there.  IE:  procedures only handle data structure nodes and macros
    handle the elements within the nodes.

      Comparision function calls make this problem even more difficult.
    These functions must be called within the data structures procedures
    (due to the need for looping), and can't thus know the data type of
    the functions arguments.

      To solve this problem you could either insist that you only allow
    pointers and integers to be stored (so why bother with all this?), or
    restrict the comparision function to compare pointers to the elements
    stored in the data structures nodes. The later, is the method that the
    C standard library functions qsort() and bsearch() so the method is
    well known.

      This method works well if you store actual C structures or intergers
    in the structure, but introduces a extra level of indirection if you
    store pointers to C structures in the nodes. IE, the comparision will
    be comparing a pointer to the pointer of the users data being
    compared. This last could also allow the user to change the actual
    value of the element stored in the data structure instead of only
    being given a copy of the element stored. But it will work fine.

    Problems:
      This method is easy for data structures that need no loops such as
        stacks and simple list structures (no sorting or search), but is
        harder to program for other data stuctures.
      The use of macros require care for side effect problems. This usually
        means that the structure header would usually need some temp
        variables included in them for the various macros to use.
      Comparision functions will be given pointers to the element stored
        in structure, instead of a copy of the element stored. IE:
        By-Referance instead of By-Value This could allow naive
        programmers to modify the element and thus destroying the data
        structures consistancy.

-------------------------------------------------------------------------------