-------------------------------------------------------------------------------
Find Command

For details of the order in which UNIX find traverses a directory tree and
processes files, see the document "info/perl/dir_traversal_notes"

It pays to remember that ALL arguments have a "-a" (and) between them, and
that "-a" binds tighter than "-o".  And remember there are 'lots of ways to
skin a cat'.

You may also like to look at "tree" (a standard linux utility) which outputs
a recursive directory listing.

-------------------------------------------------------------------------------
General Find Functions

   directories() { find "$@" -type d -print; }              # directories only
   allfiles()    { find "$@" -type f -print; }              # all plain files
   executables() { find "$@" -type f -perm -100 -print; }   # executables only
   datafiles()   { find "$@" -type f ! -perm -100 -print; } # non-executables

   # Example usage
   d=public_html  x=-r  c=
   if [ -d "$d" ]; then
      directories "$d" | xargs $x chmod $c 755   # dirs - accessible
      datafiles   "$d" | xargs $x chmod $c 644   # data - readable
      executables "$d" | xargs $x chmod $c 755   # exec - executable
   fi

Find files and directories not globally readable

   find . \! -perm -444 -print

-------------------------------------------------------------------------------
Top-level directory only

The traditional method is to use the directory contents as arguments

   find * -prune ....

That however ignores 'dot' files.  This includes them

   find ./* -prune ....

Another way is to prune at the right level

   find . \! -name . -prune

A modern way is to use the tree depth options (to do top-level only)

   find * -maxdepth 0    # does not include 'dot' files
   find . -maxdepth 1    # don't go deeper than one down
   find .
          -mindepth 1 -prune    # go one level down, then prune any deeper

-------------------------------------------------------------------------------
Exclude a sub-directory

   find {path} {conditions to prune} -prune -o \
        {your usual conditions} -print

Normally you would leave out the "-print" (or "-print0"), but you will need
it when you use "-prune" or it will go wrong.

Example...

   find . -name .snapshot -prune -o -name '*.foo' -print

WARNING: This will also ignore ".snapshot" files/directories found in any
sub-sub-directory, not just the top-level ".snapshot" directory.

You can also use "NOT -path" logic.  BUT that will still recurse into the
sub-directory, and then ignore all the files it finds there.  That can take
time if the sub-directory has a LOT of files.  AND it will still report any
access permission problems in that sub-directory!

   find . \! -path './.snapshot/*' -name '*.foo'

---
More exacting...

Add "-path" (or the equivalent "-wholename") for the specific sub-directory.

   find . -path ./.snapshot -prune -o -name '*.foo' -print

For non-GNU find (without "-path") you can use "-exec test"...

   find . -name .snapshot -exec test '{}' = './.snapshot' \; -prune \
        -o -name '*.foo' -print

DO NOT ignore the type of the filename being pruned!  Note this only matters
if the second condition may want to deal with files of the same name as the
directory being excluded.  I also recommend the use of parentheses for
clarity of the arguments, even though precedence would mean the same thing.

   find . \( -type d -path ./.snapshot -prune \) \
        -o \( -type f -print \)

---
Multiple Directory Exclude

   find . -type d \( -name media -o -name images -o -name backups \) -prune \
        -o -print

OR...

   find . -name media -prune \
        -o -name images -prune \
        -o -name backups -prune \
        -o -print

-------------------------------------------------------------------------------
Exclude a file suffix

   find . -type f \!
        -name '*.bz2' -print0 | xargs -0r bzip2 -v

-------------------------------------------------------------------------------
Find and Delete broken symbolic links

   find -L /app/nagios_scripts -type l -delete

This works as the '-L' causes find to TRY to follow symbolic links.  Broken
links will fail, and as such the '-type l' test will then be true.  It will
never be true if the symbolic link was followed, as find will no longer be
looking at a symbolic link.

The exception is if a good symlink points to a broken symlink.  The broken
one will in that case get removed, leaving the good symlink broken.

See also the linux script 'symlinks', and my own script 'symlink'

-------------------------------------------------------------------------------
Run a command in each directory

   find . -type d -print0 | xargs -0 -n1 {command}
EG:
   find . -type d -print0 | xargs -0 -n1 echo

This could be done in parallel too!

---
Run a separate find in parallel on each top-level sub-directory.
It is faster...

   find . -mindepth 1 -type d -prune -print0 |
      xargs -0 -P0 -i  find {} -name '*.dvi' -print

---
Run a separate find in every sub-directory

   find . -type d -print0 |
      xargs -0 -P0 -i  find {} -maxdepth 1 -name '*.dvi' -print

This can be slower than the above, due to process launching, but...
you can have a list of excluded directories

   find . -type d -print | fgrep -x -v -f exclude_list | xargs -n1 ls

Note the -x to ensure it matches the whole line...
Which isn't very convenient.  Something better is needed.

-------------------------------------------------------------------------------
With parallel execution (recursive compression)

   find . -type f \! -name '*.bz2' -print0 | xargs -0r -n1 -P3 bzip2 -v

   pkill -USR1 xargs    # increase parallelism
   pkill -USR2 xargs    # decrease parallelism

Or using GNU-parallel.  No longer recommended due to its 'citation
requirement' making it less portable.

   find . -type f \!
        -name '*.bz2' -print0 | parallel -q -0 -j3 bzip2 -v

   pkill -USR1 parallel    # get parallel to list the current jobs running

-------------------------------------------------------------------------------
Add a suffix to a filename

This fails, as the argument is needed twice!

   find /path -type f -exec command {}.suffix \;

Find only expands "{}" when it is a separate space-separated argument, and
does not recognise "{}.suffix" as a string substitution.

Using the "-exec" option...

   find /path -type f -exec sh -c 'command "$1".suffix' -- {} \;

WARNING: "find -exec" will pause while the command executes, rather than
continue to search for the next match (pipelined).

Using "sed" piped into a shell

   find /path -type d -print | sed 's:.*:command &.suffix:' | sh

The "sed" solution is probably the most versatile, BUT it is dangerous when
malicious or uncontrolled filenames are possible.

Using "xargs"...

   find /path -type f -print0 | xargs -0r -I{} command {}.suffix

This is better, though not as general as using a shell.  For example...

   # Bash file suffix replacement...
   for name in *.old; do  mv -vn "$name" "${name%%.old}.new"; done

Or you can use a special purpose command...

   # mv_perl - file renaming script
   find /path -type f -print0 | xargs -0r mv_perl 's/\.old$/.new/'

-------------------------------------------------------------------------------
Find-Grep

Grep will NOT output a filename if only one file argument is provided.  As
such, if you use "xargs" to run "grep", add a /dev/null argument to ensure
at least two filenames are always provided.
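The effect is easy to demonstrate; a minimal sketch (the scratch file name
and search string are illustrative):

```shell
# Scratch file to search (name and contents are illustrative).
tmp=$(mktemp -d)
echo hello > "$tmp/one.txt"

# One file argument: grep prints only the matching line, no filename.
grep hello "$tmp/one.txt"              # prints: hello

# A /dev/null second argument forces the "filename:" prefix.
grep hello "$tmp/one.txt" /dev/null    # prints: <path>/one.txt:hello

rm -r "$tmp"
```

In a find-xargs-grep pipeline, /dev/null plays the same role whenever xargs
happens to hand grep only a single filename in its final batch.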
   find /path -name "*.txt" | xargs grep "string" /dev/null

GNU-grep can use the -H option to force filename output

   find /path -name "*.txt" | xargs grep -H "string"

-------------------------------------------------------------------------------
Empty and Old Directories

These rely on GNU-find's
   '-quit'     quit on first match
   '-empty'    empty directory

   # Clean out any old files - may leave empty directories
   #
   # NOTE: you can not use the same technique for directories,
   # as directories become 'modified' when a file is deleted.
   #
   find . -type f -mtime +180 -print0 | xargs -0r rm

   #
   # Clean old directories as a whole...
   #
   # Remove top-level directories with ALL files older than 6 months (180 days)
   for DIR in *; do
      if [[ $(find "${DIR}" -type f -mtime -180 -print -quit) == "" ]]; then
         echo rm -r "${DIR}"
      fi
   done

   #
   # Clean empty directories (if only files are removed)
   #
   # Just remove empty directories (depth first)
   find . -depth -type d -empty -printf "rmdir %p\n"

   # Find top-level directories with no files in any sub-directory
   for DIR in *; do
      if [[ $(find "${DIR}" -type f -print -quit) == "" ]]; then
         echo rm -r "${DIR}"
      fi
   done

See also "Is a directory empty?" in
https://antofthy.gitlab.io/info/shell/file.txt

-------------------------------------------------------------------------------
Xargs and Parallel

Xargs' primary goal was to group filenames into batches before giving them
to a command.  However GNU-xargs can now also do parallel processing (see
above).

However caution is needed to ensure correct quoting of the arguments.
Typically this is done by replacing newlines with NULs in the input

   find ... -print0 | xargs -0r ...

Parallel is a drop-in "xargs" replacement, but it can do the "find" itself,
or run a list of shell commands.  It makes use of multiple processors to run
the batched commands that "xargs" would normally generate, and runs them in
parallel.  It is also a perl script using only standard perl libraries, so
no architecture-specific binary is needed.

   find ...
        -print0 | parallel -0q ...

---
Poor man's "xargs" (to collect groups of filenames) using "fmt"

   ls | fmt |\
   while read args; do
      grep "some words" $args
   done

The 'fmt' does the collection of arguments into lines, which are then read
by the shell loop to run 'batched' commands.  As with "xargs" it will have
quoting problems.

Also note that "xargs" can be used to create a poor man's "fmt" command.
See "Word-Wrapping or Text Formatting" in "info/shell/general.txt"

-------------------------------------------------------------------------------
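The batching described above is easy to observe; a sketch assuming GNU
sort's '-z' option for a predictable order (file names are illustrative):

```shell
# Three scratch files (names are illustrative).
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

# NUL-terminate the names, sort them for a stable order, then batch two
# names per command.  Each "echo" invocation prints one batch on its own
# line, so two lines appear: a batch of two names, then the remaining one.
find "$tmp" -type f -print0 | sort -z | xargs -0 -n2 echo

rm -r "$tmp"
```

The same "-n" batching applies when the command is something real like
"grep" or "bzip2", which is the whole point of xargs.
-------------------------------------------------------------------------------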