XML is a must-have skill for every professional: it is still broadly used by application protocols such as SOAP and SAML, and by description languages such as WSDL. It is not bound to the application integration field only: several applications rely on XML-based configuration files and blueprints. Think, for example, of the Apache Tomcat configuration files ("server.xml", "context.xml", ...), OSGi Blueprints such as a JAAS configuration, Maven POM files, or SCAP files (OVAL, XCCDF, ...).
The aim of this post is to provide a good overview of XML, explaining its goals, how to enforce it using schemas and generate a human-readable view of an XML file using XSLT.

The post also provides an overview of the XPath query language with some examples using the xpath command line tool. It shows you how to modify XML nodes and attributes using the xmlstarlet command line tool.

By the way, this post is part of a trilogy of posts dedicated to markup and serialization formats.

Of course these topics would deserve a whole book, but as usual I try to be concise, showing you only the things you are really likely to come across sooner or later when working with XML, and giving tips on where to find more information on each topic.

What is XML

The eXtensible Markup Language (XML) is a markup language aimed at defining a set of rules for encoding documents both in a human-readable and machine-readable format.

Some of the most important principles of XML are:

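  • it relies on a well-formed syntax (every tag must be closed and properly nested). Note that the special XML characters < and the ampersand must be escaped when they appear in text, using their predefined entities (named "lt" and "amp"); escaping > (entity "gt") is customary too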
  • the document has a hierarchical structure that can be enforced by the use of a schema. The most commonly used schema languages are:
    • Document Type Definitions (DTDs): they use a terse formal syntax that is mostly focused on which attributes an element has and on the nesting of the elements. They also allow you to declare very simple data types for the attributes.
    • XML Schema: it addresses the same needs as DTD, but it also lets you specify data types and validation constraints for elements and attributes.
    • Relax NG: it is a sort of compromise between the previous two
    • Schematron: it is aimed at making assertions about the presence or absence of patterns in XML trees: you can use it in addition to other schema languages.
  • by using namespaces, it is possible to use different XML variants that refer to a different schema within the same document
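
The well-formedness and escaping rules above can be tried out quickly. The following is a minimal sketch, assuming Python 3 and only its standard library; the helper name is_well_formed and the sample strings are made up for illustration and are not part of any tool mentioned in this post.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(text):
    """Return True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical text that contains XML special characters.
raw = "size < 10 && free > 5"
escaped = escape(raw)  # replaces &, < and > with entity references

unescaped_doc = f"<check>{raw}</check>"      # special characters left as they are
escaped_doc = f"<check>{escaped}</check>"    # special characters escaped
unclosed_doc = "<check><item></check>"       # the item tag is never closed

print(escaped)
print(is_well_formed(unescaped_doc))
print(is_well_formed(escaped_doc))
print(is_well_formed(unclosed_doc))
```

Only the escaped document parses: the parser gives the original text back when reading it, so escaping is lossless.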

A simple XML File

So, to use a "by examples" approach, let's pretend that we are developing a system monitoring application that has to be configured using XML. Let's state the requirement as a Scrum-style user story:

as a user, I want to be able to set alerts that run whatever command I want as checks and be able to configure notifications when needed using custom commands.

Since XML is extensible by design, it is the perfect candidate to define a custom document that describes the settings required by this use case.

We call this system monitoring application fancymonitor - let's create its home directory, "/usr/local/fancymonitor":

sudo mkdir -m 755 /usr/local/fancymonitor
sudo mkdir -m 755 /usr/local/fancymonitor/etc /usr/local/fancymonitor/schemas

Now suppose that the fancymonitor application reads its settings from the "/usr/local/fancymonitor/etc/alerts.xml" XML file. An example structure with a few settings can be as follows:

<?xml version="1.0" encoding="UTF-8"?>
<alerts>
  <alert severity="MAJOR" id="1">
    <description>Ensure that free disk space is at least 20%</description>
    <command mountpoint="/var/log" interpreter="/bin/bash" ok-when="0">
      [ $(df --output=size,avail ${MOUNTPOINT} | tail -n 1 | awk '{pfree=$2*100/$1; printf "%3.0f\n", pfree}') -gt 20 ] || exit 1
    </command>
    <notifies/>
  </alert>
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
    <command mountpoint="/var/log" interpreter="/bin/bash" ok-when="0">df -Ph ${MOUNTPOINT}
      [ $(df --output=size,avail ${MOUNTPOINT} | tail -n 1 | awk '{pfree=$2*100/$1; printf "%3.0f\n", pfree}') -gt 10 ] || exit 1
    </command>
    <notifies>
      <notify command="/usr/bin/mail" arguments="-s 'CRITICAL: Free space on /home is lower than 10%' sysadmins@carcano.local" retrigger-every="5">
        CRITICAL event - This is to notify that there's less than 10% of free space into /home mount point
      </notify>
    </notifies>
  </alert>
</alerts>

As you can see, this file begins with the XML declaration, which states the XML version in use and the character encoding required to parse the document:

<?xml version="1.0" encoding="UTF-8"?>

then there is a set of user-defined tags ("alerts", "alert", "description", "command", …), some of which also have attributes: for example, the "alert" tag has the "severity" and "id" attributes.

The above file defines two alerts:

  • the first alert, with MAJOR severity, is raised when disk free space on "/var/log" mount point is less than 20%
  • the second alert, with CRITICAL severity, is raised when disk free space on "/var/log" mount point is less than 10%

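Both alerts provide the shell command that computes the current percentage of free space: it exits with status 1 when the free space is not above the threshold, and completes with status 0 (which matches the "ok-when" attribute) otherwise.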

The second alert also provides the shell command to run to send an email notification.

The following tree highlights the structure of this sample XML document:

<alerts>
   |--<alert>
   |    |--<description>
   |    |--<command>
   |    \--<notifies>
   \--<alert>
        |--<description>
        |--<command>
        \--<notifies>
             \--<notify>

The document is structured so that we can:

  • declare as many alerts as we want
  • define notifications when needed

so it definitely addresses our use case.
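
To get a feel for how an application such as fancymonitor could consume this structure, here is a minimal sketch, assuming Python 3 and its standard xml.etree.ElementTree module. The summarize function is hypothetical, and the embedded document is an abbreviated copy of the sample above (the shell commands and the notification message are shortened).

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of alerts.xml: commands and messages are shortened placeholders.
ALERTS_XML = """\
<alerts>
  <alert severity="MAJOR" id="1">
    <description>Ensure that free disk space is at least 20%</description>
    <command mountpoint="/var/log" interpreter="/bin/bash" ok-when="0">true</command>
    <notifies/>
  </alert>
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
    <command mountpoint="/var/log" interpreter="/bin/bash" ok-when="0">true</command>
    <notifies>
      <notify command="/usr/bin/mail" arguments="sysadmins@carcano.local" retrigger-every="5">CRITICAL event</notify>
    </notifies>
  </alert>
</alerts>
"""


def summarize(xml_text):
    """Return one dictionary per alert with its main settings."""
    root = ET.fromstring(xml_text)
    summary = []
    for alert in root.findall("alert"):
        summary.append({
            "id": alert.get("id"),
            "severity": alert.get("severity"),
            "description": alert.findtext("description"),
            "mountpoint": alert.find("command").get("mountpoint"),
            "notifications": len(alert.findall("notifies/notify")),
        })
    return summary


for entry in summarize(ALERTS_XML):
    print(entry)
```

Note that attribute values are always read as strings: converting "id" or "retrigger-every" to integers is left to the application, unless a schema-aware parser does it for you.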

Some of you may observe that this goal can easily be achieved with more modern languages too, such as YAML, ... but XML has a plus that many other languages don't have: it lets you enforce the structure of a document by describing it with a schema.

XML is a huge topic that cannot be thoroughly explained within a blog post: if you need further information, you can read the W3C XML Schema "Structures" and "Datatypes" recommendations.

Enforcing the XML syntax with a schema

XML goes far beyond simply providing a way to structure data: users are humans, and humans make mistakes. Users may make typos, or forget something and do a lot of other things that make the document unreadable by the parser.

XML addresses this by supporting schema enforcement: the schema describes the syntax that can be used for a given XML document. The most straightforward benefit is that it enables the use of an agnostic linter that can parse any XML document and warn about syntax errors.

XSD goes even beyond validating the grammar, the structure of a document and the data types of the attributes: since it also lets you describe validation constraints on the values, it lets you describe an interface that can validate the provided values too. This means that, for example:

  • you can use it to render a form (either web or a desktop application) with not only text boxes with constraints on the input type, but also combo boxes pre-filled with only valid values - an example of this is the scap-workbench application, that is part of the OpenSCAP suite.
  • when developing a webservice, you can rely on it to validate the values of the attributes of the received input
  • you can use it to generate a WSDL

and so on.

Schema enforcement is broadly used by a lot of software: an example that is certainly worth mentioning is OpenSCAP. It is the open source implementation of the Security Content Automation Protocol: it is made of a scan engine that provides a lot of benchmarks that perform compliance checks - and even automatic remediation of the findings. These benchmarks are configured using XML documents (OVAL, XCCDF, DataStream, ...), and XML is also used to define the list of checks that OpenSCAP should perform when scanning.

Let's install it and have a close look at it - enter the following command:

sudo dnf -y install openscap scap-workbench

OpenSCAP is a very nice piece of software for compliance checking, with automatic remediation of the issues it finds. Describing it in this post would be off-topic, but I encourage you to give it a try, because it is really useful when you have to keep a lot of systems compliant.

Besides the scanner itself, it provides a lot of XML based documents along with the schema files. Let's have a closer look at them.

DTD Schemas

this is the list of DTD schema files provided by the OpenSCAP RPM package:

  • /usr/share/openscap/schemas/xccdf/1.1/XMLSchema.dtd
  • /usr/share/openscap/schemas/xccdf/1.1/datatypes.dtd
  • /usr/share/openscap/schemas/xccdf/1.2/XMLSchema.dtd
  • /usr/share/openscap/schemas/xccdf/1.2/datatypes.dtd

as you can see, all of them have names ending in ".dtd"

XML Schemas

this is the (partial) list of XML Schema files provided by the OpenSCAP RPM package:

  • /usr/share/openscap/schemas/oval/5.9/aix-definitions-schema.xsd
  • /usr/share/openscap/schemas/oval/5.9/aix-system-characteristics-schema.xsd
  • /usr/share/openscap/schemas/oval/5.9/apache-definitions-schema.xsd

listing all of them does not really matter: just note that all of them have names ending in ".xsd"

Schematrons

this is the list of Schematron schema files provided by the OpenSCAP RPM package:

  • /usr/share/openscap/schemas/oval/5.9/oval-definitions-schematron.xsl
  • /usr/share/openscap/schemas/oval/5.9/oval-directives-schematron.xsl
  • /usr/share/openscap/schemas/oval/5.9/oval-results-schematron.xsl
  • /usr/share/openscap/schemas/oval/5.9/oval-system-characteristics-schematron.xsl
  • /usr/share/openscap/schemas/oval/5.9/oval-variables-schematron.xsl

as you can see, all of them have names ending in "-schematron.xsl"

I'm omitting a lot of files: the vast majority of the RPM contents are indeed XML documents.

I'm pointing you to these schema files so that you can have a closer look at them by yourself: there's not enough room here to cover this huge topic, but now that you know where to find them, you can check how they are coded to get some hints.

Developing a DTD file

This is the most basic schema enforcement mechanism provided by XML (it uses a subset of the SGML DTD). It is mostly focused on:

  • which attributes an element can have
  • the nesting of the elements
  • declaring very simple data types for the attributes

Let's create the DTD that describes the schema supported by the "alerts.xml" document we previously saw; create "/usr/local/fancymonitor/schemas/alerts.dtd" file with the following contents:

<!ELEMENT alerts (alert+) >
<!ELEMENT alert (description, command, notifies) >
<!ATTLIST alert severity CDATA #REQUIRED >
<!ATTLIST alert id CDATA #REQUIRED >
<!ELEMENT description (#PCDATA) >
<!ELEMENT command (#PCDATA) >
<!ATTLIST command mountpoint CDATA #REQUIRED >
<!ATTLIST command interpreter CDATA #REQUIRED >
<!ATTLIST command ok-when CDATA #REQUIRED >
<!ELEMENT notifies (notify?) >
<!ELEMENT notify (#PCDATA) >
<!ATTLIST notify command CDATA #REQUIRED >
<!ATTLIST notify arguments CDATA #REQUIRED >
<!ATTLIST notify retrigger-every CDATA #REQUIRED >

As you can see, the syntax is really easy: we just declare elements (ELEMENT) and the list of attributes (ATTLIST) each element has, taking care of describing the nesting of the elements themselves:

<!ELEMENT notifies (notify?) >
<!ELEMENT notify (#PCDATA) >

the above snippet declares that the "notifies" element may contain at most one "notify" child element (the "?" means zero or one), and that "notify" itself contains only text.

We must now add the reference to this DTD file to the "alerts.xml" document, so that an XML parser knows there's a schema it can use to validate the syntax:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE alerts SYSTEM "file:///usr/local/fancymonitor/schemas/alerts.dtd">
<alerts>
… 

We are ready to validate the XML document using an agnostic linter: we can use xmllint, a utility that has been specifically developed to lint XML files:

xmllint --noout --dtdvalid /usr/local/fancymonitor/schemas/alerts.dtd /usr/local/fancymonitor/etc/alerts.xml

no news means good news.

Let's retry, this time with a non-compliant syntax: let's modify the XML file, for example by adding a "foo" tag somewhere - in this example I added it at the end of the document:

  </alert>
  <foo />
</alerts>

Although the document is still well-formed XML, it is no longer valid against the DTD, so this time the linter returns the following errors:

/usr/local/fancymonitor/etc/alerts.xml:3: element alerts: validity error : Element alerts content does not follow the DTD, expecting (alert)+, got (alert alert foo )
/usr/local/fancymonitor/etc/alerts.xml:22: element foo: validity error : No declaration for element foo
Document /usr/local/fancymonitor/etc/alerts.xml does not validate against /usr/local/fancymonitor/schemas/alerts.dtd
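
The difference between a well-formed document and a valid one can also be illustrated in code. The following is only a hand-written sketch, assuming Python 3: the standard library does not validate against DTDs, so the hypothetical check_structure function below merely transcribes by hand the three content models of "alerts.dtd" and ignores attributes. It is not a substitute for xmllint.

```python
import xml.etree.ElementTree as ET


def check_structure(xml_text):
    """Return a list of problems; an empty list means the structure matches."""
    root = ET.fromstring(xml_text)  # raises ParseError when not well-formed
    if root.tag != "alerts":
        return [f"root element is <{root.tag}>, expected <alerts>"]
    problems = []
    child_tags = [child.tag for child in root]
    # <!ELEMENT alerts (alert+)>: one or more alert elements, nothing else
    if not child_tags or any(tag != "alert" for tag in child_tags):
        problems.append(f"<alerts> content is {child_tags}, expected one or more <alert>")
    for alert in root.findall("alert"):
        tags = [child.tag for child in alert]
        # <!ELEMENT alert (description, command, notifies)>: fixed sequence
        if tags != ["description", "command", "notifies"]:
            problems.append(f"<alert> content is {tags}")
            continue
        # <!ELEMENT notifies (notify?)>: zero or one notify element
        notify_tags = [child.tag for child in alert.find("notifies")]
        if notify_tags not in ([], ["notify"]):
            problems.append(f"<notifies> content is {notify_tags}")
    return problems


# Both samples are well-formed; only the first one follows the content models.
VALID = "<alerts><alert><description/><command/><notifies/></alert></alerts>"
INVALID = "<alerts><alert><description/><command/><notifies/></alert><foo/></alerts>"

print(check_structure(VALID))
print(check_structure(INVALID))
```

Writing such checks by hand quickly becomes tedious, which is exactly the work a schema plus an agnostic linter spares you.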

Developing an XSD file

One of the features that DTD is missing is the ability to define constraints: the XML Schema Definition (XSD) fills this gap and also provides more granular control over the format of the document.

To see also XSD in action, we can rewrite the DTD we just did as an XSD.

Let's create "/usr/local/fancymonitor/schemas/alerts.xsd" file with the following contents:

<?xml version="1.0" encoding="UTF-8" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" targetNamespace="https://www.carcano.local" xmlns="https://www.carcano.local">
  <xsd:element name="alerts">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="alert" maxOccurs="unbounded">
          <xsd:complexType>
            <xsd:sequence>
              <xsd:element name="description" type="xsd:string"/>
              <xsd:element name="command">
                <xsd:complexType mixed="true">
                  <xsd:sequence />
                  <xsd:attribute name="mountpoint" type="xsd:string"/>
                  <xsd:attribute name="interpreter" type="xsd:string"/>
                  <xsd:attribute name="ok-when" type="xsd:string"/>
                </xsd:complexType>
              </xsd:element>
              <xsd:element name="notifies">
                <xsd:complexType>
                  <xsd:sequence>
                    <xsd:element name="notify" maxOccurs="3" minOccurs="0">
                      <xsd:complexType mixed="true">
                        <xsd:attribute name="command" type="xsd:string"/>
                        <xsd:attribute name="arguments" type="xsd:string"/>
                        <xsd:attribute name="retrigger-every" type="xsd:integer"/>
                      </xsd:complexType>
                    </xsd:element>
                  </xsd:sequence>
                </xsd:complexType>
              </xsd:element>
            </xsd:sequence>
            <xsd:attribute name="severity" type="xsd:string"/>
            <xsd:attribute name="id" type="xsd:integer"/>
          </xsd:complexType>
        </xsd:element>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>

As we already did with the DTD, we must link the "alerts.xml" file to this XSD schema, so that linters know that they can use it to validate its syntax. Simply change the beginning of the "alerts.xml" file, removing the DOCTYPE line and adding some attributes to the "alerts" tag as follows:

<?xml version="1.0" encoding="UTF-8"?>
<alerts xmlns="https://www.carcano.local" 
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="https://www.carcano.local file:///usr/local/fancymonitor/schemas/alerts.xsd">
  <alert severity="MAJOR" id="1">

as you see these attributes tell where to find the XSD schema for the "alerts" document type.

We are finally ready to validate it again - type the following command:

xmllint --noout --schema /usr/local/fancymonitor/schemas/alerts.xsd /usr/local/fancymonitor/etc/alerts.xml

as expected we fail the syntax validation - the output indeed is:

/usr/local/fancymonitor/etc/alerts.xml:21: element foo: Schemas validity error : Element '{https://www.carcano.local}foo': This element is not expected. Expected is ( {https://www.carcano.local}alert ).
/usr/local/fancymonitor/etc/alerts.xml fails to validate

this is because when testing the DTD we added the unexpected "foo" tag to see what happens with a non-compliant syntax: remove it and revalidate, and we get the following output:

/usr/local/fancymonitor/etc/alerts.xml validates

XPath

The XML Path Language (XPath) is a query language aimed at selecting nodes in XML documents. It is used by various languages and tools, such as XSLT and XQuery, and it is mainly employed in the software development and software testing fields. However, learning at least its basics is worth the effort even for system administrators and system engineers: sooner or later you have to deal with XML files, and knowing a little XPath can be a valuable plus in some situations.

XPath selects the contents using nodes as the coordinates. The most used kinds of nodes are:

  • Elements: these nodes are the XML tags, and they may have attributes. The root node, denoted by the "/" character, is the document itself: its child is the topmost element ("alerts" in our example)
  • Attributes: these nodes are the properties that element nodes may have, so from the XPath perspective element nodes are the parents of these kinds of nodes. XPath denotes them by prepending the "@" character.

The node identified by an XPath becomes the Context Node: once selected you can refer to it simply by using the dot (.) character.

From the Context Node you can move along several axes:

  • Parent axis: it selects the parent node; its full form is parent::node() and it is abbreviated by the double dot (..)
  • Child axis: it selects the children; it is the default axis, so "child::alert" can be abbreviated simply as "alert"
  • Attribute axis: it selects the attributes of the context node and it can be expressed either by the attribute::name or by the @name notation

XPath lets you express two kinds of paths:

Absolute

it starts from the root node, so it begins with "/". For example, to select the alert element with the id attribute equal to 2:

/alerts/alert[@id="2"]

Relative

it starts from the context node and descends into its children. For example (using the path of the previous example), once the "alert" node with 2 as the "id" attribute has been selected, we can use a relative path to get the "description" node as follows:

description

you can get the value of an attribute by prepending the @ character to the name of the attribute. For example, if the current node is one of the "alert" nodes, we can get the value of the severity attribute as follows:

@severity

XPath lets you express predicates: these are filters that restrict the nodes selected by the XPath expression. The predicate must be enclosed in square brackets [] and follows the node it filters. For example:

alert[@id="2"]

XPath has the following wildcards:

  • * matches all the element nodes on the current axis (for example, all the child elements of the context node); it does not match text, comments or attributes
  • @* matches all the attribute nodes of the context node
  • node() matches nodes of any kind: unlike the star wildcard (*), it also includes text nodes, comments and processing instructions

Note that XPath uses data types:

  • Number: a double-precision floating-point number (IEEE 754); XPath 1.0 has no separate integer type
  • Boolean: either true or false
  • String: a string of zero or more characters
  • Node-set: a set of zero or more nodes
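
The notions of context node, predicate, wildcard and parent axis can be tried out with a few lines of code. The following is a minimal sketch, assuming Python 3 and its standard xml.etree.ElementTree module, which implements only a subset of XPath: paths are relative to the element they are evaluated on, and attribute values are read with get() rather than with an "@name" step. The embedded document is an abbreviated copy of the sample used in this post.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of alerts.xml: the commands are shortened placeholders.
DOC = """\
<alerts>
  <alert severity="MAJOR" id="1">
    <description>Ensure that free disk space is at least 20%</description>
    <command mountpoint="/var/log">true</command>
    <notifies/>
  </alert>
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
    <command mountpoint="/var/log">true</command>
    <notifies/>
  </alert>
</alerts>
"""

root = ET.fromstring(DOC)  # the <alerts> element, used as the context node

# Predicate: the child <alert> whose id attribute equals 2.
alert2 = root.find("alert[@id='2']")

# Attribute of the selected node (what @severity would return in full XPath).
severity = alert2.get("severity")

# Relative path evaluated from the selected node.
description = alert2.findtext("description")

# Star wildcard: all the child elements of the selected node.
child_names = [child.tag for child in alert2.findall("*")]

# Parent step: from the description node back to its <alert>.
parent = root.find("alert[@id='2']/description/..")

# Descendants: every <description> below the context node.
all_descriptions = root.findall(".//description")

print(severity, description, child_names, parent.get("id"), len(all_descriptions))
```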

Working with xpath command line utility

We can now play with the xpath command line tool to practice with XPath a little bit: it is provided by the perl-XML-XPath RPM package. We can install it as follows:

sudo dnf install -y perl-XML-XPath

as an example, let's type a command to return all the "alert" nodes:

xpath -e "/alerts/alert" /usr/local/fancymonitor/etc/alerts.xml

the output is:

Found 2 nodes in /usr/local/fancymonitor/etc/alerts.xml:
-- NODE --
<alert severity="MAJOR" id="1">
    <description>Ensure that free disk space is at least 20%</description>
    <command mountpoint="/var/log" interpreter="/bin/bash" ok-when="0">
      [ $(df --output=size,avail ${MOUNTPOINT} | tail -n 1 | awk '{pfree=$2*100/$1; printf "%3.0f\n", pfree}') -gt 20 ] || exit 1
    </command>
    <notifies />
  </alert>
-- NODE --
<alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
    <command mountpoint="/var/log" interpreter="/bin/bash" ok-when="0">df -Ph ${MOUNTPOINT}
      [ $(df --output=size,avail ${MOUNTPOINT} | tail -n 1 | awk '{pfree=$2*100/$1; printf "%3.0f\n", pfree}') -gt 10 ] || exit 1
    </command>
    <notifies>
      <notify command="/usr/bin/mail" arguments="-s 'CRITICAL: Free space on /home is lower than 10%' sysadmins@carcano.local" retrigger-every="5">
        CRITICAL event - This is to notify that there's less than 10% of free space into /home mount point
      </notify>
    </notifies>
  </alert>

as expected it returns two nodes.
Let's see a less trivial example, using a predicate to focus on the "alert" node with id attribute equal to "2" and get the "description" node:

xpath -e "/alerts/alert[@id='2']/description" /usr/local/fancymonitor/etc/alerts.xml

it produces the following output:

Found 1 nodes in alerts.xml:
-- NODE --
<description>Ensure that free disk space is at least 10%</description>

The xpath command line tool is very handy when it comes to writing shell scripts that need to get values from XML files: what it is really missing is the ability to set the values of nodes.

Anyway, remember that it is better to consider languages that natively parse XML, such as Python, or even the good old Perl, rather than the xpath command line utility: each time you run it you are indeed spawning a new process.

This is to say that the xpath command line tool can be a handy workaround to quickly add features to existing scripts that may not be worth rewriting in another language.
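
As a rough idea of what the native alternative looks like, here is a minimal sketch, assuming Python 3 and its standard library only. The alert_description function is hypothetical, and the sample file is an abbreviated, namespace-free copy of "alerts.xml" written to a temporary directory so that the example does not depend on the files created earlier.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def alert_description(path, alert_id):
    """Return the description text of the alert with the given id, or None."""
    root = ET.parse(path).getroot()
    return root.findtext(f"alert[@id='{alert_id}']/description")


# Abbreviated, namespace-free copy of alerts.xml used only for this demo.
SAMPLE = """\
<alerts>
  <alert severity="MAJOR" id="1">
    <description>Ensure that free disk space is at least 20%</description>
  </alert>
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
  </alert>
</alerts>
"""

with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "alerts.xml")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE)
    found = alert_description(sample_path, 2)
    missing = alert_description(sample_path, 99)

print(found)
print(missing)
```

The file is parsed once inside the same process, so a script that needs several values does not pay the cost of launching an external command for each of them.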

Working with xmlstarlet command line utility

If you do need to set the values of nodes, consider using XMLStarlet. It is distributed in the EPEL repository, so you must first install the RPM that enables EPEL:

sudo dnf install -y epel-release
sudo dnf install -y xmlstarlet

Be aware that XMLStarlet is very picky about the default namespace: if the document sets it, we must declare it in XMLStarlet too.

Look at the following lines of our document:

<alerts xmlns="https://www.carcano.local" 
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:schemaLocation="https://www.carcano.local file:///usr/local/fancymonitor/schemas/alerts.xsd">

since it contains the "xmlns=" attribute, it sets "https://www.carcano.local" as the default namespace for the document: this requires us to declare the namespace also when running xmlstarlet, which is accomplished by specifying the following option:

-N "c=https://www.carcano.local"

that means: use "c" as a prefix bound to the "https://www.carcano.local" namespace. The consequence is that we are also required to add the "c:" prefix to every element of the XPath.

So, the xpath command:

xpath -e "/alerts/alert[@id='2']/description" /usr/local/fancymonitor/etc/alerts.xml

can be rewritten with xmlstarlet as follows:

xmlstarlet sel -N "c=https://www.carcano.local" \
-t -v "/c:alerts/c:alert[@id='2']/c:description" \
/usr/local/fancymonitor/etc/alerts.xml

  • sel is the xmlstarlet subcommand to "select" (query) data
  • -t starts the template that describes what to output
  • -v prints the value of the given XPath expression
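
The same prefix-to-namespace mapping exists in most XML libraries. The following is a minimal sketch, assuming Python 3 and its standard xml.etree.ElementTree module; the "c" prefix is an arbitrary choice, exactly as in the -N option, and the embedded document is an abbreviated copy of "alerts.xml".

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of alerts.xml, including the default namespace.
DOC = """\
<alerts xmlns="https://www.carcano.local">
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
  </alert>
</alerts>
"""

# Maps the arbitrary prefix "c" to the namespace URI, like -N "c=...".
NS = {"c": "https://www.carcano.local"}

root = ET.fromstring(DOC)

# The parser stores the namespace inside the tag name itself.
print(root.tag)

# Without the prefix nothing matches, because "alert" is not in any namespace.
without_ns = root.find("alert[@id='2']/description")

# With the prefix on every element step the node is found.
with_ns = root.find("c:alert[@id='2']/c:description", NS)

print(without_ns)
print(with_ns.text)
```

Attributes such as "id" carry no prefix: the default namespace applies to elements only.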

Since we are trying xmlstarlet because it lets us modify XML, let's set the description to "Ensure that free disk space is at least 10% and notifies via email when the event is hit":

xmlstarlet ed --inplace -N "c=https://www.carcano.local" \
-u "/c:alerts/c:alert[@id='2']/c:description" \
-v "Ensure that free disk space is at least 10% and notifies via email when the event is hit" \
/usr/local/fancymonitor/etc/alerts.xml

  • ed is the xmlstarlet subcommand to "edit"
  • -u sets the XPath of the node to be updated
  • -v sets the new value for the node

Let's check if it worked:

xpath -e "/alerts/alert[@id='2']/description" /usr/local/fancymonitor/etc/alerts.xml

as expected, the output is

Found 1 nodes in /usr/local/fancymonitor/etc/alerts.xml:
-- NODE --
<description>Ensure that free disk space is at least 10% and notifies via email when the event is hit</description>
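
For completeness, a comparable in-place update can be sketched with a general purpose language. This is only an illustration, assuming Python 3 and its standard library: the set_description function is hypothetical, and the sample file is an abbreviated copy of "alerts.xml" created in a temporary directory. Be aware that the default ElementTree parser drops comments and processing instructions, so a rewritten file loses them.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

NS_URI = "https://www.carcano.local"
NS = {"c": NS_URI}


def set_description(path, alert_id, new_text):
    """Update the description of one alert in place; return True if found."""
    # Keep the default namespace unprefixed when the document is written back.
    ET.register_namespace("", NS_URI)
    tree = ET.parse(path)
    node = tree.getroot().find(f"c:alert[@id='{alert_id}']/c:description", NS)
    if node is None:
        return False
    node.text = new_text
    tree.write(path, encoding="UTF-8", xml_declaration=True)
    return True


# Abbreviated copy of alerts.xml used only for this demo.
SAMPLE = """\
<alerts xmlns="https://www.carcano.local">
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
  </alert>
</alerts>
"""

NEW_TEXT = "Ensure that free disk space is at least 10% and notifies via email when the event is hit"

with tempfile.TemporaryDirectory() as workdir:
    sample_path = os.path.join(workdir, "alerts.xml")
    with open(sample_path, "w", encoding="utf-8") as handle:
        handle.write(SAMPLE)
    updated = set_description(sample_path, 2, NEW_TEXT)
    not_found = set_description(sample_path, 99, "ignored")
    with open(sample_path, encoding="utf-8") as handle:
        rewritten = handle.read()

print(updated, not_found)
print(rewritten)
```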

XSLT

XSLT (Extensible Stylesheet Language Transformations) is a language for transforming XML documents into other XML documents or into other formats, such as HTML or plain text.

For this reason it is suitable for:

  • Formatting - converting XML into HTML
  • Data exchange - converting data from one XML schema to another or into a data exchange format such as SOAP

Besides some built-in functions, it exploits the XPath query language: this means that you can use XPath to focus either on the whole XML document or on a subset of it.

The conversion requires the creation of an XSLT template: this file, also called an XSLT stylesheet, is written in the Extensible Stylesheet Language (XSL), a templating language. Besides XSL statements, this file may also contain additional content, such as headers and blocks of markup, that is copied to the output file.

As an example, let's see how to display a human-readable version of the "alerts.xml" file, exploiting the transform to HTML provided by XSLT.

Create "/usr/local/fancymonitor/etc/alerts.xsl" file with the following contents:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:c="https://www.carcano.local">
<xsl:output method="html" version="4.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/">
<html>
    <head>
        <style>
            #alerts table 
            {
                border-spacing: 0px;
                border-collapse: separate;
            }
            #alerts td 
            {
                text-align: left; 
                vertical-align: middle;
                padding: 5px;
            }
        </style>
    </head>
    <body>
    <h2>Alerts List</h2>
        <table id="alerts" border="1">
            <tr bgcolor="#ff6600">
                <th>Alerts</th>
            </tr>
            <tr>
                <td>
                    <xsl:apply-templates select="c:alerts"/>
                </td>
            </tr>
        </table>
    </body>
</html>
</xsl:template>
<xsl:template match="c:notifies">
    <tr>
        <th style="border-bottom-style: solid;border-bottom: thin short;">Notifications</th>
        <td colspan="7" style="border-bottom-style: solid;border-bottom: thin short;">
            <table border="0">
                <xsl:for-each select="c:notify">
                <tr>
                    <th>Command:</th>
                    <td><xsl:value-of select="@command"/></td>
                    <th>Arguments:</th>
                    <td><xsl:value-of select="@arguments"/></td>
                    <th>Retrigger every seconds:</th>
                    <td><xsl:value-of select="@retrigger-every"/></td>
                </tr>
                <tr>
                    <th>Message:</th>
                    <td colspan="5"><xsl:value-of select="."/></td>
                </tr>
                </xsl:for-each>
            </table>
        </td>
    </tr> 
</xsl:template>
<xsl:template match="c:command">
    <tr>
        <th>Command:</th>
        <td colspan="7"><xsl:value-of select="."/></td>
    </tr> 
    <tr>
        <th>Options:</th>
        <th>Mountpoint:</th>
        <td><xsl:value-of select="@mountpoint"/></td>
        <th>Interpreter:</th>
        <td><xsl:value-of select="@interpreter"/></td>
        <th>OK When:</th>
        <td><xsl:value-of select="@ok-when"/></td>
    </tr> 
</xsl:template>
<xsl:template match="c:alerts">
    <table border="0">
        <tr>
            <th style="border-bottom-style: solid;border-bottom: thin short;">Id</th>
            <th style="border-bottom-style: solid;border-bottom: thin short;">Severity</th>
            <th colspan="7" style="border-bottom-style: solid;border-bottom: thin short;">Description</th>
        </tr> 
        <xsl:for-each select="c:alert">
        <tr>
            <td rowspan="4" style="border-bottom-style: solid;border-bottom: thin short;"><xsl:value-of select="@id"/></td>
            <td rowspan="4" style="border-bottom-style: solid;border-bottom: thin short;"><xsl:value-of select="@severity"/></td>
            <td colspan="7"><xsl:value-of select="c:description"/></td>
        </tr> 
        <xsl:apply-templates select="c:command"/>
        <xsl:apply-templates select="c:notifies"/>
        </xsl:for-each>
    </table>
</xsl:template>
</xsl:stylesheet>

note that:

  • we declared "c" as the prefix bound to the "https://www.carcano.local" XML namespace
  • we iterate using XPath expressions that explicitly specify the "c" prefix, for example select="c:alert"

XSL stylesheets usually contain one or more template elements, which describe what action to take when a certain XML element or query is found in the XML file.

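Besides the root template (the one matching "/"), the above file contains three templates (the line numbers refer to the "alerts.xsl" listing):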

  • c:alerts (lines 74-91)
  • c:command (lines 59-73)
  • c:notifies (lines 36-58)

Their purpose is to specify a template made of HTML tags and XPath selects: the latter are rendered by fetching the values from the XML document.

These templates are then instantiated inside other templates. For example, the "c:alerts" template is instantiated within the "/" template at line 29, the "c:command" template is instantiated within the "c:alerts" template at line 87 and the "c:notifies" template is instantiated within the "c:alerts" template at line 88.

Please note that templates can be listed in any order.

Note how inside the templates we make use of XSLT statements such as for-each, used to iterate over XML nodes. Look, for example, at the following snippet, taken from "alerts.xsl" (line 81 and line 89):

<xsl:for-each select="c:alert">
<tr>
…
</tr>
</xsl:for-each>

note also how we render the values fetched using the XPath select using "<xsl:value-of select="..."/>" statements, for example:

<xsl:value-of select="@id"/>

fetches the value of the "id" attribute of the current node.
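
If you are more familiar with imperative languages, it may help to see what for-each and value-of boil down to. The following is not XSLT and it is not how xsltproc works internally: it is just a rough analogy, assuming Python 3 and its standard xml.etree.ElementTree module, with a hypothetical render_rows function and an abbreviated copy of "alerts.xml".

```python
import xml.etree.ElementTree as ET

NS = {"c": "https://www.carcano.local"}

# Abbreviated copy of alerts.xml used only for this demo.
DOC = """\
<alerts xmlns="https://www.carcano.local">
  <alert severity="MAJOR" id="1">
    <description>Ensure that free disk space is at least 20%</description>
  </alert>
  <alert severity="CRITICAL" id="2">
    <description>Ensure that free disk space is at least 10%</description>
  </alert>
</alerts>
"""


def render_rows(xml_text):
    """Build an HTML table with one row per alert and return it as a string."""
    root = ET.fromstring(xml_text)
    table = ET.Element("table")
    # Plays the role of <xsl:for-each select="c:alert">
    for alert in root.findall("c:alert", NS):
        row = ET.SubElement(table, "tr")
        # Each assignment plays the role of one <xsl:value-of select="..."/>
        ET.SubElement(row, "td").text = alert.get("id")
        ET.SubElement(row, "td").text = alert.get("severity")
        ET.SubElement(row, "td").text = alert.findtext("c:description", namespaces=NS)
    return ET.tostring(table, encoding="unicode")


html = render_rows(DOC)
print(html)
```

The XSLT version has the advantage of being declarative: the stylesheet can be changed without touching any program.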

Same as before, we must link the XML file to its XSLT file: add the xml-stylesheet processing instruction at the beginning of the XML file, right before the "alerts" tag that opens the document:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="alerts.xsl"?>
<alerts xmlns="https://www.carcano.local" … >
…

We are ready to have a go, generating the "/tmp/output.html" HTML document with the xsltproc command line utility. Just type the following command:

xsltproc -o /tmp/output.html /usr/local/fancymonitor/etc/alerts.xsl /usr/local/fancymonitor/etc/alerts.xml

have a look at the generated HTML file, just to see how this magic really did the trick.

Please note that most browsers are capable of processing XSLT on the fly: this means that we can even open "alerts.xml" in our favorite browser and get it directly rendered as HTML, … but first, there's a caveat you must be aware of:

Most web browsers consider loading an XSLT stored in the local filesystem risky: for example, Firefox raises the "CORS request not http" error. You can avoid this by opening the "about:config" URL, then looking up "security.fileuri.strict_origin_policy" and setting it to false.

I want to share with you a trick that spares you from disabling the security policy described in the above note: simply fire up a temporary HTTP server and load the XML document through it.

We can achieve this with the HTTP server provided by Python - first we need to change to the directory where both the "alerts.xml" and "alerts.xsl" files are stored:

cd /usr/local/fancymonitor/etc

if you are using python3 type the following command:

python3 -m http.server 8080

otherwise, if you are using python2:

python -m SimpleHTTPServer 8080

Now launch your favorite web browser and open the URL of the Python web server: in this example I launched the web browser on the same host where the Python server is running, so I can connect to the loopback IP address (127.0.0.1) on the port the temporary HTTP server is listening on (8080):

http://127.0.0.1:8080/alerts.xml

Your browser should render the document as an HTML table listing the alerts, which is far more human-readable than the raw XML document, isn't it?

Footnotes

Here ends our quick tour of the amazing world of XML: I tried to summarize all of the things that you are likely to have to do sooner or later. I couldn't elaborate on it to the extent it actually deserves, otherwise it wouldn't fit the size of a post. Anyway, we have at least got the gist of things: now you know the potential of XML, when it is convenient to use it, how to query and modify XML documents, as well as how to render them in a human-readable form using XSLT.
