-------------------------------------------------------------------------------
YAML  -  "YAML Ain't Markup Language" or "Yet Another Markup Language"

  A format that is designed for human friendly data structures using line and
  whitespace delimiters.  It avoids the use of quotation marks, brackets,
  braces, and open/close-tags wherever possible, as these can be hard for the
  human eye to balance in nested hierarchies.

  This makes it good as a "configuration data language".

  The amount of indentation is unimportant as long as it is the same for the
  whole element.  Tabs are never allowed as part of the indentation.

  NOTE: YAML is not a markup language but a data serialization language,
  which is why "YAML Ain't Markup Language" is probably more accurate.

  This format is used by Docker, Puppet, and Ansible.

  From "The YAML document from hell"
    https://ruudvanasseldonk.com/2023/01/11/the-yaml-document-from-hell

      For a data format, yaml is extremely complicated. It aims to be a
      human-friendly format, but in striving for that it introduces so much
      complexity, that I would argue it achieves the opposite result. Yaml is
      full of footguns and its friendliness is deceptive.

    The article shows many ways YAML goes unexpectedly wrong.

  ASIDE: In many ways YAML and JSON are somewhat interchangeable, and in fact
  YAML includes JSON hashes and arrays as part of the language.

  YAML Specification is COMPLEX - 10 chapters with section numbers 4 levels
  deep.
    https://yaml.org/spec/1.2.2/
  Errata Page
    https://yaml.org/spec/1.2/errata.html

  There are 9 ways (or 63+, depending on how you count) to write a multi-line
  string!
    https://stackoverflow.com/questions/3790454

  Always quote ALL string values...

  Alternatives to YAML for configuration:
    Toml   -- simpler and more regimented
    python -- you can parse it, and then use json.dump to separate the data
              from your program.

-------------------------------------------------------------------------------
Example...
=======8<--------CUT HERE----------axes/crowbars permitted---------------
--- # The first line marks the start of a yaml file (for streaming purposes)
# This is of course a comment, and can appear anywhere, to end of line
receipt:     Oz-Ware Purchase Invoice
date:        2012-08-06
customer:
    given:   Dorothy
    family:  Gale

items:
    - part_no:   A4786
      descrip:   Water Bucket (Filled)
      price:     1.47
      quantity:  4

    - part_no:   E1628
      descrip:   High Heeled "Ruby" Slippers
      size:      8
      price:     100.27
      quantity:  1

delivered: true
payment has been made: false

# ship-to is same as bill-to (using anchors and aliases)
bill-to:  &id001
    street: |
        123 Tornado Alley
        Suite 16
    city:   East Centerville
    state:  KS

ship-to:  *id001

specialDelivery:  >
    Follow the Yellow Brick
    Road to the Emerald City.
    Pay no attention to the
    man behind the curtain.

# next line marks the end of a yaml file (for streaming purposes)
# It is optional, but useful for yamllint which hates trailing blank lines.
...
=======8<--------CUT HERE----------axes/crowbars permitted---------------

-------------------------------------------------------------------------------
Elements and structure...

Important tips...

  * You MUST have a space after the colon of a label, like this...
        label: value
    If you don't then this...
        label:value
    is read as the single string "label:value", and inside a flow mapping
    such as { label:value } it becomes equivalent to this...
        "label:value": null

  * yes -> true  and  no -> false  (in YAML 1.1 parsers, such as PyYAML)
    If you really need the string "yes" (for example: sshd_config values)
    you MUST quote it.  That is  'yes' -> yes

  * null means no value at all.
    If treated as a string it is '', as a number, 0

=======8<--------
# comment
key: value
another_key: Another value goes here.
a_number_value: 100
scientific_notation: 1e+12
0.25: a floating-point key
boolean: true
null_value: null
key with spaces: value
empty_hash: {}
empty_list: []

single quotes: 'Has only ''one'' escape pattern'
double quotes: "have many: \", \0, \t, \u263A, \x0d\x0a == \r\n, escapes."
'Keys can be quoted': "Needed if you want a ':' in key name"

block-literal: |
    This format preserves newlines and continues
    until the indent is removed.  The indentation of
    the block is set by its first line.

block-literal: |2
      The 2 is the amount the following block is indented within the yaml
      file, relative to the start of the 'b' in 'block-literal' above.
      Anything beyond this is preserved in the block literal itself.
      In this case we have 4 extra spaces of indent

block-literal: >
    This one does not preserve newlines.
    Single newlines are folded into spaces, so
    word-wrapping is removed from the data.

block-literal: >-
    Adding '-' to form '|-' or '>-' will also
    strip the final newline from the string

block-literal: >+
    Adding '+' to form '|+' or '>+'
    will also include any extra blank lines that follow

dictionary:
    key: value
    key: value
    key: value

dictionary-in-dictionary:
    nested_map:
        deeper: went the nesting
        and deeper still: it went

list:
    - item
    - item
    - item
    - - nested list
      - inside a sequence
    - - - triple nested list
        - with indicators collapsed

set:
    ? item1
    ? item2
    ? item3
# Equivalent to {item1, item2, item3}
# EG: hash keys with null values (see below)
# This is rarely used

#
# YAML can include JSON equivalents...
#
inline-array: [ a, b, c, d, e ]
inline-dict: { name: John, age: 53 }

# Anchors for repeated elements (see below)
&anchor-label
*reference-to-anchor

--- # document separator
... # End of File
=======8<--------

-------------------------------------------------------------------------------
Explicit Data Types

a: 123                     # an integer
b: "123"                   # a string, disambiguated by quotes
c: 123.0                   # a float
d: !!float 123             # explicit float, data type prefixed by (!!)
e: !!str 123               # a string, disambiguated by explicit type
f: !!str Yes               # a string via explicit type
g: Yes                     # a boolean True
h: !!bool Yes              # a boolean True
i: Yes we have No bananas  # a string, "Yes" and "No" disambiguated by context.

# ISO-formatted dates are also directly understood.
datetime: 2001-12-15T02:59:43.1Z
datetime_with_spaces: 2001-12-14 21:59:43.10 -5
date: 2002-12-14

picture: !!binary |
    R0lGODlhDAAMAIQAAP//9/X
    17unp5WZmZgAAAOfn515eXv
    Pz7Y6OjuDg4J+fn5OTk6enp
    56enmleECcgggoBADs=mZmE

user-defined-type: !myClass { name: Joe, age: 15 }

-------------------------------------------------------------------------------
Syntax Definition...

  A compact cheat sheet as well as a full specification are available at the
  official site.  The following is a synopsis of the basic elements.

    * YAML streams are encoded using the set of printable Unicode characters,
      either in UTF-8 or UTF-16.  For JSON compatibility, the UTF-32
      encodings must also be supported in input.

    * Whitespace indentation is used to denote structure; however tab
      characters are never allowed as indentation.

    * Comments begin with the number sign (#), can start anywhere on a line
      and continue until the end of the line.  Comments must be separated
      from other tokens by white space characters.  A number sign that
      appears inside a string is a literal (#), not a comment.

    * List members are denoted by a leading hyphen (-) with one member per
      line, or enclosed in square brackets ([]) and separated by
      comma-space (", ").

    * Associative arrays are represented using the colon-space (": ") in the
      form "key: value", either one per line or enclosed in curly braces ({})
      and separated by comma-space (", ").

    * An associative array key may be prefixed with a question mark (?) to
      allow for liberal multi-word keys to be represented unambiguously.

    * Strings (scalars) are ordinarily unquoted, but may be enclosed in
      double-quotes ("), or single-quotes (').

    * Within double-quotes, special characters may be represented with
      C-style escape sequences starting with a backslash (\).  According to
      the documentation the only octal escape supported is \0.

    * Block scalars are delimited with indentation to preserve (|) or fold
      (>) newlines,
      with optional modifiers ('+', '-') to denote how the final newline is
      to be handled.

    * Multiple documents within a single stream are separated by three
      hyphens (---).

    * Three periods (...) optionally end a document within a stream.

    * Repeated nodes are initially denoted by an ampersand (&) and thereafter
      referenced with an asterisk (*).

    * Nodes may be labeled with a type or tag using the exclamation point
      (! for local tags, !! for the standard types) followed by a string,
      which can be expanded into a URI.

    * YAML documents in a stream may be preceded by directives composed of a
      percent sign (%) followed by a name and space delimited parameters.
      Two directives are defined in YAML 1.1:

        * The %YAML directive is used to identify the version of YAML in a
          given document.

        * The %TAG directive is used as a shortcut for URI prefixes.  These
          shortcuts may then be used in node type tags.

  YAML requires that colons and commas used as list separators be followed by
  a space, so that scalar values containing embedded punctuation, such as
  numbers ("5,280"), times ("15:30"), or URLs ("http://www.wikipedia.org"),
  can generally be represented without needing to be enclosed in quotes.

  Two additional sigil characters are reserved in YAML for possible future
  standardisation: the at sign (@) and grave accent (`).

-------------------------------------------------------------------------------
YAML Anchors (Aliases)

  Anchors are identified by & and aliases by *

  Example
      - &flag Apple
      - Beachball
      - Cartoon
      - Duckface
      - *flag

  This results in
      - Apple
      - Beachball
      - Cartoon
      - Duckface
      - Apple

  Another Example
      node_1: &some_keys
        key1: value1
        key2: value2
      node_2:
        <<: *some_keys
        key2: new_value

  The '<<' means the key-values from the alias should be merged into this
  mapping.  So the keys 'key1' and 'key2' are merged, and 'key2' is then
  updated.
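
  The two anchor examples above can be checked with a short Python sketch.
  It assumes PyYAML is installed (pip install PyYAML); PyYAML follows YAML 1.1
  and understands the '<<' merge key, which other parsers may not.  The
  variable names are only for illustration.

```python
import yaml  # assumption: PyYAML is installed (pip install PyYAML)

# An alias (*flag) repeats the node that was anchored (&flag).
LIST_DOC = """
- &flag Apple
- Beachball
- Cartoon
- Duckface
- *flag
"""

# '<<' merges the anchored mapping; keys given locally override merged ones.
MERGE_DOC = """
node_1: &some_keys
  key1: value1
  key2: value2
node_2:
  <<: *some_keys
  key2: new_value
"""

flags = yaml.safe_load(LIST_DOC)
nodes = yaml.safe_load(MERGE_DOC)

print(flags)            # the alias is expanded to a second "Apple"
print(nodes["node_2"])  # key1 is inherited, key2 is replaced
```

  Note that node_1 is left untouched; only node_2 sees the new 'key2' value.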

  You can reuse an alias template key to define multiple anchors, reuse it
  again, or empty it to remove it from the final data structure

      aliases:
        - &list1
          - a
          - b
        - &list2
          - c
          - d
      aliases:
        - &merged
          - *list1
          - *list2
      aliases:
      list_A: *list1
      list_B: *list2
      list_C: *merged

  NOTE: &merged is a list of lists!  Which may not work as you expect it to.
  There appears to be no way to 'flatten' the merged list.

  NOTE: You can merge ONE item into the final list, just not multiple!

  In Docker (compose files) you can use an 'x-' prefix for the template key,
  as Docker will ignore such prefixed items.

-------------------------------------------------------------------------------
Implementations

  MOST let you read specific things from a yaml file,
  BUT do not let you search the file for specific elements.

  Bash
    These tend to be overly simple, and fail in bad ways.

    # sed-awk yaml parser
    https://github.com/jasperes/bash-yaml/tree/master/script

  perl YAML.pm
    reads entire file in before parsing

  perl YAML::Tiny
    only reads the first document in the stream and stops

  yamllint.noarch
    reports yaml parse errors only

  python
    # pip install PyYAML

    # read yaml
    import yaml
    from pprint import pprint
    data = yaml.dump(var)          # 'var' is any python data structure
    pprint(yaml.safe_load(data))   # plain yaml.load() needs a Loader argument

    # write yaml
    import yaml   # python -m pip install PyYAML
    d = {
      "List": {
        "title": "example glossary",
        "ID": "SGML",
        "SortAs": "SGML",
        "Array": ["GML", "XML"],
      }
    }
    with open('output.yaml', 'w') as f:   # 'with' closes the file for us
        f.write(yaml.dump(d))

    OR...
    import yaml
    from pprint import pprint

    FILENAME = 'inventory.yml'
    with open(FILENAME) as file:
        data = yaml.safe_load(file)   # plain data only, no python objects

    # data is a python dictionary
    pprint( data["all"]["children"]["app_seet"]["hosts"] )

    # just the hosts, and not the values in group "app_seet"
    for k,v in data["all"]["children"]["app_seet"]["hosts"].items():
        print(k)

    # the inventory groups host is in
    host = "na-tst-ant.itc.griffith.edu.au"
    for k,v in data["all"]["children"].items():
        # a group may be empty, or have no "hosts" entry
        if host in ((v or {}).get("hosts") or {}):
            print(k)

  pip install niet
    python yaml/json extractor & converter
    https://github.com/openuado/niet

    Get contents of app_seet
      niet all.children.app_seet.hosts inventory.yml
      niet all.children.app_seet.hosts inventory.yml -f json

    # => the whole sub-structure in various ways

    # evaluated to shell variables in arrays...
    echo '{"foo-biz": "bar", "fizz": {"buzz": ["zero", "one"]}}' |
      niet -f eval . | sed 's/;/;\n/g'

  pip install yamlpath
    Can get elements, but can also SEARCH for element locations (paths)!

    yaml-get
      Using a 'yamlpath' retrieve info

      All app_ant_test hosts, and any data...
        yaml-get -p /all/children/app_ant_test/hosts inventory.yml

      All grp_s3 hosts
        yaml-get -p /all/children/grp_s3 inventory.yml | json_pp

      S3 DEV hosts...
        yaml-get -p /all/children/grp_s3 ~/ansible/inventory.yml | \
          json_pp | cut -d\" -f2 | grep -- -dev-

      Union of "grp_s3" and "env_development"
      EG: get both groups and find hosts that are listed twice
        for grp in grp_s3 env_development
        do
          yaml-get -p /all/children/$grp/hosts inventory.yml | json_pp
        done | awk -F\" 'NF==3{ a[$2]++; if(a[$2]==2) print $2 }'

      Nathan hosts...
        yaml-get -p /all/children/nathan/hosts inventory.yml |\
          json_pp | grep -oP '"\K[\w.-]*'

    yaml-paths
      Search yaml returning paths to the entry...

      NOTES:
        -K    returns a key path, -L the matching leaf node value
        -t/   replaces '.' with '/' in the paths generated
        -F    skips reporting the source file
        -s    ^ is starting with, = for exact, $ for end of string,
              % for contains (as used below)

      WARNING: Ansible inventory only contains keys, no leaf nodes.
      Unless host vars are added (EG: in insights_inventory)

      Starting with...
        yaml-paths -K -s ^na-tst-ant inventory.yml

      Contains string...
        yaml-paths -K -t/ -s %ant inventory.yml

      Include data (in JSON) of the keynames found
        yaml-paths -KL -t/ -s %ant inventory.yml

      The groups (at specific level) the keyname is in...
        yaml-paths -FK -t/ -s %ant inventory.yml | cut -d/ -f4

      All the paths to hostnames in the file
        yaml-paths -FK -t/ -s \$griffith.edu.au inventory.yml

      Pivot to list by host and their groups
        yaml-paths -FK -t/ -s \$griffith.edu.au inventory.yml

      Data of this exact keyname (group)
        yaml-paths -FKL -t/ -s =app_smtp_internal inventory.yml

      Elements with this exact leaf node value
        yaml-paths -FL -t/ -s =local inventory.yml

    yaml-diff
      How do files differ in a meaningful way

    yaml-merge
      Merge multiple yaml/json files together

    yaml-validate
      Validate YAML or JSON files.

  YQ for yaml...
    https://github.com/mikefarah/yq

    This has some nice practical examples
      https://unix.stackexchange.com/questions/665242/

  nushell
    https://www.nushell.sh/
    A data processing shell for JSON, YAML, SQLite, Excel, etc

  Online parser
    https://yaml-online-parser.appspot.com/

  Convert YAML to JSON
    NOTE: this seems to need three EOFs!
    You can then extract data from the JSON using 'jq'
    (see "json.txt" - yuck)

      yaml2json() {
        ruby -ryaml -rjson -e 'puts JSON.pretty_generate(YAML.load(ARGF))' "$@"
      }
      yaml2json data.yaml

-------------------------------------------------------------------------------
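Convert YAML to JSON (python)

  A python equivalent of the ruby yaml2json function above, as a minimal
  sketch only.  It assumes PyYAML is installed (pip install PyYAML).  The
  function name 'yaml_to_json' and the SAMPLE text are made up for this
  example.  JSON has no date type, so dates are written out as strings.

```python
import json

import yaml  # assumption: PyYAML is installed (pip install PyYAML)


def yaml_to_json(text):
    """Return one pretty-printed JSON string per YAML document in 'text'."""
    docs = yaml.safe_load_all(text)
    # default=str converts values JSON cannot represent (such as dates)
    return [json.dumps(doc, indent=2, default=str) for doc in docs]


# A small sample; PyYAML reads the unquoted date as a python date object.
SAMPLE = """
date: 2012-08-06
delivered: true
items: [A4786, E1628]
"""

for converted in yaml_to_json(SAMPLE):
    print(converted)
```

  A stream holding several documents (separated by '---') gives one JSON
  string per document, as JSON itself has no multi-document form.

-------------------------------------------------------------------------------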