-------------------------------------------------------------------------------
XML -- eXtensible Markup Language
Essentially based on SGML, on which HTML is also based. As such it is very
commonly used on the WWW. Basically a generalized form of HTML-style markup
in which you define your own tags.
Because of this most browsers can directly pretty-print XML data.
NOTE: Unlike HTML, all tags must be closed.
Example... Recent Documents List...
~/.local/share/recently-used.xbel
A DTD at the start (or included from elsewhere) describes how the data is
structured before the data begins (i.e. the document's syntax).
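The NOTE above about closing tags can be checked with any XML parser. A
minimal sketch using Python's standard library (the helper name
is_well_formed and the sample strings are assumptions for illustration):

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

# HTML tolerates a bare <br>; XML needs <br/> or <br></br>.
closed = is_well_formed("<p>one<br/>two</p>")
unclosed = is_well_formed("<p>one<br>two</p>")
print(closed, unclosed)
```

The second string fails because the parser reaches the closing p tag while
the br element is still open.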
-----
Examples...
These two files are equivalent
=======8<--------
50
=======8<--------
=======8<--------
grep Gary R Epstein
stty Simon T Tyson
50
=======8<--------
Whitespace inside an element is, however, preserved...
These are not the same
=======8<--------
Text
=======8<--------
=======8<--------CUT HERE----------
Text
=======8<--------CUT HERE----------
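A small sketch of the whitespace point, using Python's standard library.
The element name "tag" is hypothetical; only the text "Text" comes from
the samples above:

```python
import xml.etree.ElementTree as ET

# Same element, once with the text tight against the tags and once
# with the text on its own indented line.
compact = ET.fromstring("<tag>Text</tag>")
spaced = ET.fromstring("<tag>\n  Text\n</tag>")

print(repr(compact.text))
print(repr(spaced.text))

# The parser hands back the newlines and indentation untouched, so the
# two documents only compare equal after an explicit strip().
same_after_strip = spaced.text.strip() == compact.text
```

Whether that extra whitespace matters is up to the application reading
the data; the parser itself does not discard it.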
-------------------------------------------------------------------------------
Inclusion
The top-level element can declare the namespace
xmlns:xi="http://www.w3.org/2001/XInclude"
which specifies a scheme that allows XML files to include content from
other files... but only if the XML reader implements it!
https://en.wikipedia.org/wiki/XInclude
=======8<--------
=======8<--------
Openbox is known to follow this syntax; I am not certain about other XML
readers.
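As an illustration that inclusion depends on the reader, here is a sketch
using Python's standard library, where XInclude is a separate explicit
step (xml.etree.ElementInclude) and not part of normal parsing. The file
name menu.xml, the element names, and the in-memory loader are all
assumptions made so the example runs without files on disk:

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical included document, kept in memory instead of on disk.
FILES = {
    "menu.xml": "<menu><item>Terminal</item></menu>",
}

MAIN = """\
<config xmlns:xi="http://www.w3.org/2001/XInclude">
  <xi:include href="menu.xml"/>
</config>
"""

def loader(href, parse, encoding=None):
    # Called once per xi:include element; returns the element to splice in.
    return ET.fromstring(FILES[href])

root = ET.fromstring(MAIN)
before = [child.tag for child in root]   # still the raw xi:include element

ElementInclude.include(root, loader=loader)
after = [child.tag for child in root]    # replaced by the included <menu>

print(before)
print(after)
```

Plain parsing leaves the xi:include element in place; only the explicit
include() call replaces it with the referenced content.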
-------------------------------------------------------------------------------
Document Type Definitions (DTDs)
A DTD can be declared externally...
=======8<--------
...
=======8<--------
Or internally...
=======8<--------
]>
Everyday Italian
30.00
=======8<--------
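A sketch of what an internal DTD can look like, and how a non-validating
parser treats it. The element names, the entity "shop", and the document
layout are assumptions for illustration (only "Everyday Italian" and
"30.00" come from the sample above). Python's ElementTree reads the
internal subset and expands entities declared there, but it does not
validate the document against the element declarations:

```python
import xml.etree.ElementTree as ET

DOC = """\
<?xml version="1.0"?>
<!DOCTYPE book [
  <!ELEMENT book (title, price)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT price (#PCDATA)>
  <!ENTITY shop "Corner Bookshop">
]>
<book>
  <title>Everyday Italian (&shop;)</title>
  <price>30.00</price>
</book>
"""

book = ET.fromstring(DOC)
title = book.findtext("title")          # entity reference is expanded
price = float(book.findtext("price"))

print(title)
print(price)
```

Actual validation against a DTD needs a validating parser, which is
outside the scope of this sketch.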
-------------------------------------------------------------------------------
Shell
xmlstarlet Query XML documents
-------------------------------------------------------------------------------
Perl API
XML Project http://perl-xml.sourceforge.net/
Summary FAQ on the different XML Perl modules
XML::Parser
The first XML parser for Perl, started by Larry Wall.
XML::Simple
Good for simpler forms of XML (deprecated; its own documentation
discourages use in new code)
XML::SAX
Simple API for XML
XML::DOM
Loads document completely into memory
XML::XSLT
Hard to use?
For a basic summary see the articles
Parsing XML into a simple hash
http://www.perlmonks.org/?node_id=90287
Death to Dot Star! (using regexps to parse XML is not good)
http://www.perlmonks.org/?node_id=24640
-------------------------------------------------------------------------------
Perl XML::Simple
Reads the XML into a single data structure; it can also be used with SAX.
Strict mode ensures things work as expected!
You can provide information on how the data is to be stored.
EG: into arrays, hashes (indexed), or single values
use strict;
use warnings;
use XML::Simple qw(:strict);
my $xml_data = XMLin("file.xml",
  ForceArray => [        # force an array even with only one element
    qw( HASH ),          # these are array/hash elements
    qr/_list$/,          # you can use a regexp to select tags
  ],
  KeyAttr => {           # array to hash, specifying the key attribute
    HASH => 'KEY',
  },
  GroupTags => {         # collapse a tag that only holds one sub-tag type
    TAG => 'SUBTAG',     # ->{TAG}->{SUBTAG} becomes just ->{TAG}
                         # SUBTAG may be a value, array, or hash
  },
  # NOTE: 'Handler' is an XMLout() option (SAX output); XMLin() does not
  # accept it, so no handler is passed here.  A code reference would be
  # written \&function, as a bare &function calls the function instead.
);
use Data::Dumper;
print Dumper($xml_data);
-------------------------------------------------------------------------------
Perl XML::SAX Simple API for XML
Uses handlers and filters to avoid reading LARGE data structures into memory
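The handler idea is the same in every SAX implementation: the parser
calls your methods as it meets each tag, and nothing is kept unless you
keep it. This sketch uses Python's standard xml.sax module as a stand-in,
not Perl's XML::SAX; the class name, element names, and sample data are
assumptions for illustration:

```python
import xml.sax

class ItemCollector(xml.sax.ContentHandler):
    """Count <item> elements and remember their name attributes."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.names = []

    def startElement(self, name, attrs):
        # Called for every opening tag as the parser streams the input.
        if name == "item":
            self.count += 1
            self.names.append(attrs.get("name"))

DATA = b'<list><item name="grep"/><item name="stty"/><other/></list>'

handler = ItemCollector()
xml.sax.parseString(DATA, handler)
print(handler.count, handler.names)
```

Only the counter and the list of names stay in memory; the document tree
itself is never built.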
-------------------------------------------------------------------------------
Perl XML::Twig
Efficiently process XML files.
You can process the file while it is still being read in.
http://www.xmltwig.org/
Processing can use callbacks, or just handle smaller parts of the data tree.
See the tutorial for the various ways of processing the tree.
Especially: 4.3 The flush and purge methods
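The process-while-reading pattern, with finished parts discarded as you
go, can be sketched with Python's standard library. This is an analogue
of the idea, not XML::Twig itself: iterparse() plays the role of the
callbacks and clear() roughly corresponds to purging a finished part.
The element names and sample data are assumptions for illustration:

```python
import io
import xml.etree.ElementTree as ET

DATA = b"<log><entry id='1'>start</entry><entry id='2'>stop</entry></log>"

seen = []
# "end" events fire as soon as an element's closing tag has been read,
# so each entry can be handled before the rest of the input arrives.
for event, elem in ET.iterparse(io.BytesIO(DATA), events=("end",)):
    if elem.tag == "entry":
        seen.append((elem.get("id"), elem.text))
        elem.clear()   # drop the text and attributes already handled

print(seen)
```

Note that clear() empties the element but leaves the empty shell attached
to its parent, so this is only an approximation of a true purge.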
-------------------------------------------------------------------------------