XML is not a programming language

XML is great for a lot of things and, yes, the L stands for language. The M is the key here though – markup. XML is a markup language, a way of representing data.

It’s a standard for representing data structures. It’s a bit verbose, in the way that the Pacific Ocean is a bit wet, but it does the job well.

What it isn’t is a programming language. Unfortunately, procedural programming occasionally gets shoehorned in, and I think that’s a bad idea.

Here’s an example of what XML is intended to do, ignoring all the schema and namespace stuff for clarity:


A simple representation of a name. Markup languages eliminate many of the issues of other data formats – they don’t rely on fixed-length fields, or on knowing which order the fields appear in. Each entry is clearly delimited by the markup – in this case the opening tag indicates that everything until the matching end tag will be information about a person. Each field has a start and end tag, and if you use a schema you can even validate that you have the required fields.

So far so good.

Note that this could be written as:


The order of the first name and surname fields should be irrelevant. When programming gets shoehorned into XML, this is no longer true.
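To make the point about ordering concrete, here is a minimal Python sketch using the standard library's xml.etree.ElementTree. The tag names (person, firstname, surname) and the sample values are assumptions made up for illustration, not taken from any real schema. Both documents carry the same data, so reading the fields by name gives the same result whichever way round they appear.

```python
import xml.etree.ElementTree as ET

# Hypothetical documents: tag names and values are illustrative assumptions.
DOC_A = "<person><firstname>Jane</firstname><surname>Doe</surname></person>"
DOC_B = "<person><surname>Doe</surname><firstname>Jane</firstname></person>"


def person_to_dict(xml_text):
    """Read each child element of the root into a dict keyed by tag name."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


# Fields are looked up by name, so their order in the document does not matter.
same_data = person_to_dict(DOC_A) == person_to_dict(DOC_B)
```

As pure data, the two documents are interchangeable. That stops being the case once the elements are instructions to be executed in sequence.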

I’ve been looking at ESBs – Enterprise Service Buses. I’ve already written a post on what they are and why I’m looking at implementing one. Most of them store their configurations in XML files, which seems perfectly reasonable except for one glaring issue. The configuration includes the logic of how to route messages – i.e. if this then that.

Here’s how the WSO2 ESB does it in XML:

  <filter source="get-property('To')" regex=".*/StockQuote.*">

Or an example from the Mule ESB documentation:

  <choice doc:name="Choice">
      <when expression="#[flowVars.language == 'french']">
          <set-payload value="Bonjour!" doc:name="Reply in French"/>
      </when>
      <when expression="#[flowVars.language == 'spanish']">
          <set-payload value="Hola!" doc:name="Reply in Spanish"/>
      </when>
      <otherwise>
          <set-variable variableName="language" value="English" doc:name="Set Language to English"/>
          <set-payload value="Hello!" doc:name="Reply in English"/>
      </otherwise>
  </choice>

In each example, a new programming language has been invented – replacing if/then/else with filter/then/else or choice/when/otherwise. The verbosity of XML gets in the way of readability – each tag has to have an end tag – and the order matters.
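For comparison, here is roughly the same branching as the Mule example, sketched in Python. The function name and the idea of passing the flow variables around as a plain dict are my assumptions for illustration; this is not how Mule itself represents them.

```python
def reply(flow_vars):
    """Pick a greeting based on the 'language' flow variable.

    Mirrors the choice/when/otherwise example. flow_vars is assumed to be
    a plain dict standing in for Mule's flow variables.
    """
    language = flow_vars.get("language")
    if language == "french":
        return "Bonjour!"
    elif language == "spanish":
        return "Hola!"
    else:
        # Default branch: record the language, then reply in English.
        flow_vars["language"] = "English"
        return "Hello!"
```

The logic is identical, but there are no end tags to match up and the if/elif/else keywords are ones most programmers already know.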

When the routing gets more complex, the ESB usually supports inline scripting in another language. If you want to perform complex conditional transformations of your incoming data before you pass it on to another system, you usually end up writing that script inline with your XML, inside a special tag that marks its contents as a script. Here’s an example from WSO2:

  <script language="js"><![CDATA[
         var symbol = mc.getPayloadXML()..*::Code.toString();
            <m:getQuote xmlns:m="http://services.samples/xsd">

The <script language="js"><![CDATA[ … ]]></script> block indicates that anything contained within should be treated as a script, in this case JavaScript.

JavaScript is doing the heavy lifting, which is fair enough because it is a general-purpose language. The downside is that it’s relatively hard to debug an inline script. Once operations get more complex, perhaps involving enriching a message with data retrieved from a second system, or interacting with a database, the XML method gets messy, complex and hard to manage.

I’m not singling out WSO2 here, by the way; it just happens to be the one I’m most familiar with. It’s actually a great choice if you want a traditional ESB, and one I’d happily recommend. It just suffers from the same limitations that seem to affect all of the XML-based ESBs.

It seems to me that there should be a simpler way to do it. An ESB takes an incoming message, transforms it if necessary and passes it on to one or more other systems. Wouldn’t it be easier if all of that could be done in a proper programming language?

One approach would be to use CGI-style scripting under Apache, assuming your messages are sent over HTTP. You could use the language of your choice to analyse the data coming in, determine the format, extract the fields and then process as required. If you use an API framework, such as Slim for PHP, some of the complexity is handled for you.
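As a rough idea of what the do-it-yourself route looks like, here is a minimal sketch in Python. It uses WSGI rather than classic CGI purely to keep it short, and it only understands flat JSON or XML messages. The function names and the echo-style "processing" are assumptions for illustration, not a recommended design.

```python
import json
import xml.etree.ElementTree as ET
from io import BytesIO


def parse_body(body, content_type):
    """Turn a request body into Python data, choosing a parser by content type."""
    if "json" in content_type:
        return json.loads(body)
    if "xml" in content_type:
        root = ET.fromstring(body)
        # Flat documents only: one level of child elements under the root.
        return {child.tag: child.text for child in root}
    raise ValueError("unsupported content type: " + repr(content_type))


def app(environ, start_response):
    """Minimal WSGI application: read the message, extract the fields, reply."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    try:
        fields = parse_body(body, environ.get("CONTENT_TYPE", ""))
    except (ValueError, ET.ParseError) as err:
        start_response("400 Bad Request", [("Content-Type", "text/plain")])
        return [str(err).encode("utf-8")]
    # Placeholder for real processing: echo back what was understood.
    reply = json.dumps({"received": fields}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json")])
    return [reply]


def call_app(body, content_type):
    """Invoke the app directly, without a web server, and return (status, body)."""
    seen = []
    environ = {
        "CONTENT_LENGTH": str(len(body)),
        "CONTENT_TYPE": content_type,
        "wsgi.input": BytesIO(body),
    }
    chunks = app(environ, lambda status, headers: seen.append(status))
    return seen[0], b"".join(chunks)
```

To try it over HTTP, the standard library's wsgiref.simple_server.make_server can serve app on a local port. Even at this size, the format detection and error handling are already most of the code, and nothing useful has been done with the message yet.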

There’s still a lot of code to write to do anything useful and it’s hard to scale. You’ll end up reinventing a lot of wheels, and may eventually decide that a full-blown ESB would be a better option.

Wouldn’t it be nice then if there was an ESB that let you write your workflows in a simple scripting language but handled all of the hard work of managing incoming and outgoing messages? That let you use, say, Python to define your processes rather than XML? It could take incoming messages in JSON, XML or SOAP, convert them to a Python dictionary object automatically and then let you write the code to process them. Outgoing messages could also be built up as dictionary objects and automatically converted to the format required by the endpoint.
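Here is a small sketch of that idea in Python, to show how little would be left for the integrator to write. Everything in it is hypothetical: the function names, the message root tag and the field names are my own, and it is not the API of any particular product. It also only handles flat JSON and XML, and leaves SOAP out entirely.

```python
import json
import xml.etree.ElementTree as ET


def to_dict(body, fmt):
    """Convert an incoming message to a dict (flat JSON or XML only)."""
    if fmt == "json":
        return json.loads(body)
    if fmt == "xml":
        return {child.tag: child.text for child in ET.fromstring(body)}
    raise ValueError("unsupported format: " + fmt)


def from_dict(data, fmt, root_tag="message"):
    """Convert an outgoing dict to the format the endpoint expects."""
    if fmt == "json":
        return json.dumps(data)
    if fmt == "xml":
        root = ET.Element(root_tag)
        for key, value in data.items():
            ET.SubElement(root, key).text = str(value)
        return ET.tostring(root, encoding="unicode")
    raise ValueError("unsupported format: " + fmt)


def workflow(message):
    """The only part the integrator writes: plain Python on a dict."""
    return {"fullname": message["firstname"] + " " + message["surname"]}


def handle(body, in_fmt, out_fmt):
    """What the surrounding framework would do around the workflow."""
    return from_dict(workflow(to_dict(body, in_fmt)), out_fmt)
```

With this in place, handle('{"firstname": "Jane", "surname": "Doe"}', "json", "xml") takes a JSON message in and hands an XML one on, and the workflow function never needs to know which format was on either side.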

So much the better if it also had the features that make an ESB a better choice than DIY scripts – such as load balancing, rich statistics, credential management, an admin GUI and pre-defined integrations.

Luckily there is. And on that cliffhanger I’ll leave it for today…

