-------------------------------------------------------------------------------
Standard file/word parsing problems

Shell Word Parsing
    abc 'xyz' "def" 'a\b"c"'\ xyz\ "1'2'3\\45" end-of\
    -line

CSV (comma separated, quoted strings)
    "a,b",c,"d,e,\"f\",h","i,j,k"

Config files
    numerical: 99
    boolean: True
    string: "See spot run."

XML Containers...
    xyzzy

-------------------------------------------------------------------------------
For shell word parsing, see
    man 3 wordexp

Note that it also performs filename globbing and command substitution, so it
may be considered overkill for most situations.

-------------------------------------------------------------------------------
CSV

This is reasonably well defined, and the more common problem.  For simple,
uncomplicated input it works well, but once you start getting quoted strings
containing commas, and then escaped quotes, it all quickly descends into
madness!

But before trying to parse it...
    Stop Rolling Your Own CSV Parser!
    http://secretgeek.net/csv_trouble.asp

Pseudo code...

    Loop on the string letter by letter.
      If current_letter == quote :
        toggle inside_quote variable.
      Else if (current_letter == comma and not inside_quote) :
        push current_word into array and clear current_word.
      Else
        append the current_letter to current_word
    When the loop is done push the current_word into array

Note that this pseudo code does not handle escaped quotes, such as the \"
in the CSV example at the top.

From Stack Overflow
    http://stackoverflow.com/questions/6209/

Also look at
    http://www.codeproject.com/KB/recipes/qstringparser_net.aspx

-------------------------------------------------------------------------------
Config files.

Like CSV, this starts out reasonably simple but then descends into chaos as
more types and specifications become involved.

The phpMyAdmin and GDM configs are probably among the best examples.

    [main]
    numerical: 99
    boolean: True

    [daemon]
    string: "See spot run."
    List: "Tom", "Dick", "Harry"
    Item[1]: Apples
    Item[2]: Oranges
    Item[3]: Pears

-------------------------------------------------------------------------------
XML

Originally developed from HTML parsing efforts.
Extremely well defined, with lots of libraries.
Generates a hierarchical tree that you can follow and read.

But it is very verbose, and horrible to edit.

XML C library: expat

-------------------------------------------------------------------------------
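Shell word parsing, Python sketch

The notes above point at wordexp(3) in C.  As a lighter-weight illustration,
Python's standard shlex module splits a line using POSIX quoting rules only,
with no globbing and no command substitution.  This swaps in shlex for
wordexp; it is not a binding to it.  The names SAMPLE and split_words are
made up for this sketch, and the 'end-of\' line continuation from the sample
is left out.

```python
import shlex

# The sample line from the top of these notes, minus the 'end-of\' continuation.
SAMPLE = r"""abc 'xyz' "def" 'a\b"c"'\ xyz\ "1'2'3\\45" """


def split_words(line):
    """Split a line into words using POSIX shell quoting rules only."""
    return shlex.split(line, posix=True)


if __name__ == "__main__":
    for word in split_words(SAMPLE):
        print(repr(word))
```

The sample splits into four words; the last one keeps its embedded spaces and
quotes because the escaped spaces join three quoted pieces into a single word.

-------------------------------------------------------------------------------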
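CSV, Python sketch

A minimal sketch of the quote-toggling pseudo code from the CSV section.  One
thing is added that the pseudo code does not have, and it is an assumption:
inside quotes a backslash takes the next character literally, so the \"
in the sample line survives.  The function names are made up for this sketch.
In keeping with the "stop rolling your own" advice, the same line is also fed
to Python's standard csv module for comparison.

```python
import csv


def split_csv_line(line, delimiter=",", quote='"', escape="\\"):
    """Split one CSV record by walking it character by character."""
    fields = []
    current = []
    inside_quote = False
    chars = iter(line)
    for ch in chars:
        if ch == escape and inside_quote:
            # Assumption (not in the pseudo code): backslash escapes the
            # next character while inside a quoted string.
            current.append(next(chars, ""))
        elif ch == quote:
            inside_quote = not inside_quote
        elif ch == delimiter and not inside_quote:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    # When the loop is done, push the last word.
    fields.append("".join(current))
    return fields


def split_with_library(line):
    """Same job using the standard csv module, with backslash as escape."""
    return next(csv.reader([line], escapechar="\\"))


if __name__ == "__main__":
    sample = r'"a,b",c,"d,e,\"f\",h","i,j,k"'
    print(split_csv_line(sample))
    print(split_with_library(sample))
```

The hand-rolled version still ignores doubled quotes ("") and newlines inside
quoted fields, which is exactly where the madness starts.

-------------------------------------------------------------------------------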
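Config files, Python sketch

The sample in the config section happens to be readable by Python's standard
configparser module, which accepts ':' as a key/value delimiter.  The sketch
below shows where the chaos creeps in: the parser hands back strings, and
every type (number, boolean, quoted string, list) has to be decided by the
caller.  The helper names unquote and split_list are made up here, and the
list splitting is deliberately naive.

```python
import configparser

SAMPLE = """\
[main]
numerical: 99
boolean: True

[daemon]
string: "See spot run."
List: "Tom", "Dick", "Harry"
Item[1]: Apples
Item[2]: Oranges
Item[3]: Pears
"""


def load_config(text):
    """Parse INI-style text; option names are lower-cased by default."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def unquote(value):
    """Strip one pair of surrounding double quotes, if present."""
    if len(value) >= 2 and value[0] == value[-1] == '"':
        return value[1:-1]
    return value


def split_list(value):
    """Naive list split: breaks if an item itself contains a comma."""
    return [unquote(item.strip()) for item in value.split(",")]


if __name__ == "__main__":
    cfg = load_config(SAMPLE)
    print(cfg.getint("main", "numerical"))
    print(cfg.getboolean("main", "boolean"))
    print(unquote(cfg.get("daemon", "string")))
    print(split_list(cfg.get("daemon", "list")))
    print(cfg.get("daemon", "item[1]"))
```

-------------------------------------------------------------------------------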
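XML, Python sketch

Python ships a binding to the expat library mentioned in the XML section, as
xml.parsers.expat.  Expat itself is event driven, so building the tree is up
to the caller.  The sketch below collects the events into nested
dictionaries; the function name parse_to_tree and the dictionary layout are
made up for illustration, and the input document is an invented example.

```python
import xml.parsers.expat


def parse_to_tree(text):
    """Build a nested dict tree from expat start/end/text events."""
    root = {"tag": None, "attrs": {}, "text": "", "children": []}
    stack = [root]  # the innermost open element is stack[-1]

    def start(name, attrs):
        node = {"tag": name, "attrs": attrs, "text": "", "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(data):
        stack[-1]["text"] += data

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(text, True)  # True marks this as the final chunk
    return root["children"][0]


if __name__ == "__main__":
    doc = '<list><item id="1">Apples</item><item id="2">Oranges</item></list>'
    tree = parse_to_tree(doc)
    for child in tree["children"]:
        print(child["attrs"]["id"], child["text"])
```

-------------------------------------------------------------------------------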