-------------------------------------------------------------------------------
A directory traversal is a very complex tree traversal!

In a normal tree structure you have three basic ways of traversal...

    Depth First         Top-Down         Breadth First
         7                  1                  1
       /   \              /   \              /   \
      3     6            2     5            2     3
     / \   / \          / \   / \          / \   / \
    1   2 4   5        3   4 6   7        4   5 6   7

In a directory traversal, each node (directory) holds a potentially very long
list of filenames, both directory and non-directory.  This vastly complicates
matters, including the definition of exactly what order a directory tree is
traversed in.

You also have the 'order' of the names within each directory to consider, as
many UNIX filesystems just return the filenames in a directory in unsorted,
pseudo-random (stored list) order.

As such, within just one directory (node of the tree) the names could be
processed in...
    unsorted "as found" order
    alphabetically sorted order (case sensitive or insensitive?)
    a user specified order.
      EG: certain files first (readme?), numerically, then anything else?

Then you have the question of whether you want to separate the sub-directory
entries from the list before or during processing (forming two separate
traversals of each directory node).

So with the filename list do you...
    do each entry in the order given.
    do all file entries THEN do all directory entries
    do all directory entries THEN do all file entries

None of this actually involves the traversal of the sub-directories, only the
handling of a single directory's list of sub-directory and filename entries.

So when do you actually traverse the sub-directories?
    before/after processing ALL the filenames in the parent directory?
    only before/after looking at the non-directory filenames?
    traverse a sub-directory immediately before/after processing its name?

As you can see, directory traversal has a lot of possibilities.

This is not just for file directories, but also for other sorts of
directories, such as an LDAP directory tree of entries, or some other complex
data structure (that may even contain loops).
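
The choices above can be sketched as a small Python generator.  This is only
an illustration under stated assumptions, not how any particular tool works:
the function name walk() and its parameters sort_key, split and dirs_when are
made up for this example.  It always descends into a sub-directory at the
point where its name is reached in the (possibly re-ordered) list, so it
covers the depth-first and top-down cases but not a breadth-first traversal.
Symbolic links to directories are treated as plain names, which is one simple
way to avoid loops.

```python
import os


def walk(top, sort_key=None, split="mixed", dirs_when="pre"):
    """Yield every path under `top`, ordered by three independent policies.

    sort_key  : None keeps the "as found" order returned by the filesystem;
                otherwise a key function for sorted(), e.g. str.lower.
    split     : "mixed", "files_first" or "dirs_first".
    dirs_when : "pre"  yields a directory name before its contents,
                "post" yields it after its contents.
    """
    if split not in ("mixed", "files_first", "dirs_first"):
        raise ValueError("unknown split policy: %r" % (split,))
    if dirs_when not in ("pre", "post"):
        raise ValueError("unknown dirs_when policy: %r" % (dirs_when,))

    def is_dir(name):
        path = os.path.join(top, name)
        # Symbolic links are never descended into, so link loops are avoided.
        return os.path.isdir(path) and not os.path.islink(path)

    # Policy 1: the order of names within this one directory.
    names = os.listdir(top)
    if sort_key is not None:
        names = sorted(names, key=sort_key)

    # Policy 2: optionally separate files from directories.  sorted() is
    # stable, so the order chosen above is kept inside each group.
    if split == "files_first":
        names = sorted(names, key=is_dir)              # False before True
    elif split == "dirs_first":
        names = sorted(names, key=lambda n: not is_dir(n))

    # Policy 3: when a directory name is handled relative to its contents.
    for name in names:
        path = os.path.join(top, name)
        if is_dir(name):
            if dirs_when == "pre":
                yield path
            yield from walk(path, sort_key, split, dirs_when)
            if dirs_when == "post":
                yield path
        else:
            yield path
```

For example, list(walk(".", sort_key=str.lower, split="files_first")) gives a
case-insensitive listing in which the plain files of every directory come
before any of its sub-directories.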

In summary...

Any tree recursion has
    the sort order of the directory entries
    whether to process file and directory entries separately, and in what
      order
    when to traverse sub-directories (depth first, top-down, breadth first),
      which may even use a different sort order!

-------------------------------------------------------------------------------
Specific directory traversal programs...

UNIX "find" command...

    Traverses in unsorted directory listing order (file and directory names
    mixed).

    By default it traverses each sub-directory immediately after processing
    that sub-directory's own filename.

    What this means is that when a directory is being processed, only some of
    the non-directory files in the parent directory may have been processed,
    and what has or has not been processed depends on the effectively random
    order in which the entries were stored.

    The "-depth" option changes when a directory's name is processed, from
    immediately before to immediately after the contents of that directory.

Perl File::Find

    Perl version 5.7
        Optional sorting of the retrieved directory listing.
        Processes all non-directory filenames in a directory first.
        Then processes each sub-directory (in reverse order), processing a
        directory filename immediately before traversing it.

    Version 5.8
        File::Find is exactly as in 5.7, but a patch I provided makes it
        handle the directories in the correct order.  The file/directory
        separation is still present (there is no option to turn it off), and
        is in most cases desirable.

        The "bydepth" option makes the directory filename be processed AFTER
        that directory is traversed.  That is all it does; filenames are
        still processed in the same order as before, non-directory files
        first, then directory files.

Perl File::Find::Parallel

    From CPAN.  Parallel directory searches.

-------------------------------------------------------------------------------
Anthony Thyssen    7 November 2003
-------------------------------------------------------------------------------