GetXML Plugin for Movable Type

Current version: 1.1 (7/12/04)

About the GetXML Plugin

This Movable Type plugin implements a set of template tags for retrieving data in XML format and displaying the data on your MT-generated pages. It's basically a Movable Type interface to the extremely handy XML::Simple Perl module. The plugin will work with any well-formed XML document that can be retrieved via an HTTP GET request.

The GetXML plugin is extremely generalized, and to use it you'll have to understand the structure of the XML data you want to work with. Dedicated MT plugins are available for particular types of XML data, such as RSS feeds, Amazon content, and weather forecasts. If there's a specific plugin out there for the XML resource you're interested in, by all means use it instead. GetXML is intended for XML data for which no plugin exists yet, or whose structure is simple enough that it might not merit its own plugin.

Requirements

To cache XML data, the GetXML plugin uses a feature of Movable Type (MT::PluginData) that was introduced in MT version 2.6. If you're using an earlier version, the plugin will work, but will not do any caching.

The GetXML plugin uses the following Perl modules:

  • LWP::Simple (this module, in turn, requires several other modules, which are installed as part of the LWP package)
  • XML::Simple
  • XML::Parser (used by XML::Simple)
  • Storable (optional; used for caching; if Storable is not installed, GetXML will work but will not do any caching)
  • Data::Dumper (optional; only needed if you want to use the MTGetXMLDump tag)

Installation

To install the GetXML plugin, upload the file GetXML.pl to the plugins directory within your Movable Type directory. If you do not already have a plugins directory, create one before uploading the file. For more information about Movable Type plugins, see the documentation.

Support

Please use the support forums for all support requests, bug reports, feature requests, questions, and comments regarding this plugin.

Accessing XML Data

An XML document consists of tagged elements. An element can contain either a piece of text (a simple element) or other elements and/or attributes (a complex element).

The MTGetXMLValue tag displays the text of a simple element. The MTGetXMLElement container tag lets you access the elements contained within a complex element. Within an MTGetXMLElement container, the tags that take an element name (MTGetXMLElement, MTGetXMLValue, MTIfXMLElementExists, MTIfXMLElementNotExists, and MTGetXMLElementCount) refer to the elements nested within that element. You can nest MTGetXMLElement tags as deeply as necessary.

If an element contains a text value and also contains other elements or attributes, the text value will be treated as a nested element with the name content.

An element may have "sibling" elements with the same name, within the same containing element. If this is the case, MTGetXMLElement will loop through all the sibling elements, displaying its contents once for each.

Let's look at an example XML document:

<?xml version="1.0"?>
<sportspicks xmlns="http://example.com/sports.dtd">
 <title>Bob's Daily Picks</title>
 <game sport="basketball">
  <hometeam>Knicks</hometeam>
  <awayteam>Celtics</awayteam>
  <spread source="vegas">Knicks by 6</spread>
  <pick>Knicks</pick>
 </game>
 <game sport="basketball">
  <hometeam>Pistons</hometeam>
  <awayteam>Spurs</awayteam>
  <spread source="vegas">Pistons by 10</spread>
  <pick>Pistons</pick>
 </game>
</sportspicks>

Here's the MT template code you might use to display this data with the GetXML plugin:

<MTGetXML location="http://example.com/sportspicks.xml">
 <p><$MTGetXMLValue name="title"$></p>
 <MTGetXMLElement name="game">
  <$MTGetXMLValue name="awayteam"$> at <$MTGetXMLValue name="hometeam"$> 
  <MTGetXMLElement name="spread">
   (spread: <$MTGetXMLValue name="content"$>)<br> 
  </MTGetXMLElement>
  Bob's pick: <$MTGetXMLValue name="pick"$>
 </MTGetXMLElement>
</MTGetXML>

A couple of things to note about this example:

  • There's no need to use an outer <MTGetXMLElement name="sportspicks"> container, because by default the plugin omits the XML document's root element. If you do want to include the root element, pass keeproot="1" to MTGetXML.
  • Because the spread tag has an attribute, the element is treated as a complex element, so the tag's value has to be accessed as if it were a nested element called content.
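
To make the "content" and sibling behavior concrete, here is a rough Python sketch of the kind of nested structure the tags walk through. It is only an approximation written for illustration: the simplify function is hypothetical, it is not the plugin's code, and it does not reproduce all of XML::Simple's options or edge cases (for instance, how empty elements are represented). The sample data is a trimmed copy of the document above, without the xmlns attribute.

```python
import xml.etree.ElementTree as ET


def simplify(elem):
    """Convert an element to a plain value, loosely imitating XML::Simple."""
    text = (elem.text or "").strip()
    # A simple element (no attributes, no children) collapses to its text.
    if not elem.attrib and len(elem) == 0:
        return text
    node = dict(elem.attrib)  # attributes become nested entries
    if text:
        node["content"] = text  # text alongside attributes is named "content"
    for child in elem:
        value = simplify(child)
        if child.tag in node:
            # Repeated sibling names are gathered into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    return node


SAMPLE = """<sportspicks>
 <title>Bob's Daily Picks</title>
 <game sport="basketball">
  <hometeam>Knicks</hometeam>
  <spread source="vegas">Knicks by 6</spread>
 </game>
 <game sport="basketball">
  <hometeam>Pistons</hometeam>
  <spread source="vegas">Pistons by 10</spread>
 </game>
</sportspicks>"""

# The root element itself is dropped, as with the plugin's default.
picks = simplify(ET.fromstring(SAMPLE))
print(picks["title"])
for game in picks["game"]:
    print(game["hometeam"], "-", game["spread"]["content"])
```

In this sketch the two game elements end up in a list (which is what MTGetXMLElement loops over), and each spread becomes a small mapping with source and content entries, which is why the template has to ask for content by name.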

There are two special ways to access a set of sibling simple elements.

  • If you're looping through a set of simple elements, you can use MTGetXMLValue without passing an element name:
    <ingredient>2 cans tomatoes</ingredient>
    <ingredient>1/3 cup sliced olives</ingredient>
    <ingredient>6 oz. feta cheese</ingredient>
    <ingredient>2 Tbsp olive oil</ingredient>
    <MTGetXMLElement name="ingredient">
     <$MTGetXMLValue$><br>
    </MTGetXMLElement>
  • If a set of sibling elements consists of only simple elements, you can use MTGetXMLValue directly to display the values of all the sibling elements, strung together. This is useful in cases where an element's content is interspersed with other elements, in which case you end up with the text broken up into multiple content elements. (This is true in the Bible example below.)

MTGetXML

This container tag is the master tag that tells the plugin to retrieve and interpret an XML document. All the other tags must go within MTGetXML.

You can specify the location of the XML document in one of two ways. You can simply put the entire location into the location attribute, or you can pass only the base URL in location and also pass any number of other attributes that will be used as arguments to construct a query string. For example, given this code:

<MTGetXML location="http://machine.somewhere.com/services/funxml.cgi" year="1973" zip="11201">
...
</MTGetXML>

the plugin will attempt to load the following URL:

http://machine.somewhere.com/services/funxml.cgi?year=1973&zip=11201
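
The way argument attributes turn into a query string can be sketched in a few lines of Python. This is an illustration only, not the plugin's Perl code: build_url is a hypothetical helper, and the escaping applied by urlencode is an assumption that may differ from what the plugin does with special characters.

```python
from urllib.parse import urlencode


def build_url(location, **arguments):
    """Append keyword arguments to a base URL as a query string."""
    if not arguments:
        return location  # a full location with no extra arguments is used as-is
    return location + "?" + urlencode(arguments)


url = build_url(
    "http://machine.somewhere.com/services/funxml.cgi", year="1973", zip="11201"
)
print(url)
```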

Both the location attribute and the argument attributes can contain MT template code, with square brackets instead of angle brackets and single quotes instead of double quotes. (To include a literal square bracket or single quote, escape it by preceding it with a backslash.) Any MT template tags within these attributes will be evaluated and the resulting text will be used in the location.

The tag takes the following attributes:

  • location="url" (required)
    The full location of the XML document, or the base URL to which the argument attributes will be added.
  • [argument]="value" (optional)
    An argument to be added to the base location URL.
  • cache="N" (optional)
    To avoid retrieving the same XML repeatedly in rapid succession, the plugin caches each retrieved document in MT's database. By default, the cached copy will be used if the template is rebuilt again within 15 minutes, after which the plugin will re-request the document. To keep the cached copy for a different length of time, use the cache attribute. The cached copy will be used for N minutes. If you don't want the plugin to cache the document at all, pass cache="0". To change the default caching duration, edit this line toward the top of the GetXML.pl file:
    my $DEFAULT_CACHE_MINS = 15;
  • errors="display|warn|die|ignore" (optional)
    This setting tells the plugin what to do if an error occurs in getting the XML document: either the specified URL cannot be retrieved, or the XML document is not well-formed and cannot be parsed. Which option you use in a given situation will depend on the context in which you're using the XML data.
    • display: (default) MTGetXML will display an error message on the built page. The template rebuild will proceed.
    • warn: The template rebuild will proceed, and a warning message containing the error will be shown in the rebuild window. MTGetXML will not display anything.
    • die: The template rebuild will be aborted and an error message will be shown in the rebuild window.
    • ignore: MTGetXML will not display anything, and the error will not be reported.

    To change the default setting, edit this line toward the top of the GetXML.pl file:

    my $DEFAULT_ERRORS = 'display';
  • keeproot="1" (optional)
    By default, XML::Simple omits the outermost, or root, element when parsing an XML document. Pass keeproot="1" to make XML::Simple include the root element (in which case you'll need to use an extra MTGetXMLElement container in order to access the data).
  • suppressempty="1" (optional)
    By default, if an XML element is empty (containing nothing, or only whitespace), XML::Simple will include an empty element in the parsed data structure, meaning that MTIfXMLElementExists will consider that element to exist. Pass suppressempty="1" if you want empty elements to be omitted entirely.
  • noattr="1" (optional)
    By default, XML::Simple treats the attributes of an XML element as nested elements. For example, take the following XML:
    <title language="english">A Book of Verses</title>
    In order to display the title, you would have to use this code:
    <MTGetXMLElement name="title">
     <$MTGetXMLValue name="content"$>
    </MTGetXMLElement>

    If you're not concerned with the data contained in the attributes, you can pass noattr="1" and XML::Simple will ignore attributes entirely, meaning you could display the title more directly, like this:

    <$MTGetXMLValue name="title"$>
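
The timing rule behind the cache attribute described above can be pictured with a small Python sketch. It is a simplified illustration under stated assumptions, not the plugin's implementation: use_cached_copy is a hypothetical function, timestamps are plain epoch seconds, and whether a copy exactly N minutes old still counts as fresh is a guess.

```python
import time

DEFAULT_CACHE_MINS = 15  # mirrors $DEFAULT_CACHE_MINS in GetXML.pl


def use_cached_copy(cached_at, cache_mins=DEFAULT_CACHE_MINS, now=None):
    """Return True if a copy stored at `cached_at` (epoch seconds) is still fresh."""
    if cached_at is None or cache_mins <= 0:
        return False  # nothing stored, or caching disabled with cache="0"
    now = time.time() if now is None else now
    return (now - cached_at) < cache_mins * 60


stored = 1_000_000  # arbitrary timestamp for the demonstration
print(use_cached_copy(stored, now=stored + 10 * 60))        # rebuilt after 10 minutes
print(use_cached_copy(stored, now=stored + 20 * 60))        # rebuilt after 20 minutes
print(use_cached_copy(stored, cache_mins=0, now=stored + 1))  # cache="0"
```

With the default of 15 minutes, the rebuild 10 minutes later would reuse the stored copy, the rebuild 20 minutes later would request the document again, and cache="0" always requests it.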

MTGetXMLElement

This container tag lets you access the elements and attributes within a complex element, and loop through a series of sibling elements.

The tag takes the following attributes:

  • name="value" (required)
    The name of the element.
  • limit="N" (optional)
    When looping through a series of sibling elements, use this attribute to display only the first N elements.

MTGetXMLValue

This tag displays the value of a simple element.

The tag takes the following attributes:

  • name="value" (usually required)
    The name of the element. Can be omitted if the containing MTGetXMLElement tag is looping through a set of sibling elements that are all simple elements.
  • errors="display|warn|die|ignore" (optional)
    If you try to use MTGetXMLValue to display an element that's actually a complex element containing multiple values, the plugin will treat this as an error. The errors attribute determines how the error will be handled. The options are the same as described above under the errors attribute of MTGetXML.
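
As a way of reading the four errors settings side by side, here is a hedged Python sketch of the dispatch they describe. The handle_error function is hypothetical, and Python's warnings module and RuntimeError merely stand in for MT's rebuild-window warning and aborted rebuild; the plugin itself is written in Perl and may be organized quite differently.

```python
import warnings


def handle_error(message, errors="display"):
    """Return the text to place on the page when a fetch or parse fails."""
    if errors == "display":
        return message  # shown on the built page; the rebuild continues
    if errors == "warn":
        warnings.warn(message)  # reported, but nothing is displayed
        return ""
    if errors == "die":
        raise RuntimeError(message)  # stands in for aborting the rebuild
    if errors == "ignore":
        return ""  # nothing displayed, nothing reported
    raise ValueError(f"unknown errors setting: {errors!r}")


print(repr(handle_error("Could not retrieve URL")))
print(repr(handle_error("Could not retrieve URL", errors="ignore")))
```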

MTIfXMLElementExists

This container tag displays its contents if and only if the specified element exists at the current level. (Note that the element does not have to contain anything; see the suppressempty attribute to MTGetXML if you don't want empty elements to be considered to exist.)

The tag takes the following attribute:

  • name="value" (required)
    The name of the element.

MTIfXMLElementNotExists

This container tag displays its contents if and only if the specified element does not exist at the current level.

The tag takes the following attribute:

  • name="value" (required)
    The name of the element.

MTGetXMLElementCount

This tag displays the total number of sibling elements with the specified name that exist at the current level.

The tag takes the following attribute:

  • name="value" (required)
    The name of the element(s).

MTGetXMLElementIndex

Within an MTGetXMLElement container that's looping through multiple sibling elements, this tag displays the index of the current loop iteration.

The tag takes the following attribute:

  • name="value" (required)
    The name of the element.

MTGetXMLURL

This tag displays the URL of the XML document that MTGetXML retrieved. This is useful if you're using MT template tags within the URL and aren't sure of the built values of the tags.

MTGetXMLCacheDate

This tag displays the date and time the XML document was last cached (it will display [Not cached] if the document has not been cached). Like other MT date tags, MTGetXMLCacheDate takes format and language attributes. See Date Tag Formats in the MT documentation for more information.

MTGetXMLDump

This tag displays a "dump" of the plugin's internal representation of the entire set of XML elements, as created by XML::Simple when it parsed the raw XML data. This display includes some Perl syntax, but even if you don't know Perl it should be fairly clear which elements are nested within which other elements, which elements have siblings, etc. This tag isn't intended to be used on a template you'll display to visitors, but it's useful for figuring out how to use the GetXML tags to work with the structure of a given XML document.

space_to_plus

This global tag attribute simply converts whitespace in any tag's output to plus signs. Multiple consecutive whitespace characters are condensed to a single plus sign. This can be useful for including multi-word MT fields in an XML URL. For example, if you want to pass your entry title as a query string to a search engine that returns results in XML:

<MTGetXML location="http://example.com/searchengine.cgi" q="[MTEntryTitle space_to_plus='1']">
...
</MTGetXML>

For an entry titled Some Thoughts on Cornflakes, the resulting URL would be:

http://example.com/searchengine.cgi?q=Some+Thoughts+on+Cornflakes
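
The transformation itself is small enough to sketch in Python. This is an illustration of the rule as described here, not the plugin's code; the function simply borrows the attribute's name.

```python
import re


def space_to_plus(text):
    """Replace each run of whitespace with a single plus sign."""
    return re.sub(r"\s+", "+", text)


# Two spaces before "Cornflakes" still produce a single plus sign.
print(space_to_plus("Some Thoughts on  Cornflakes"))
```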

Usage Examples

Below are a few examples that use GetXML to display data from actual XML resources. They were all developed not by examining any DTDs but simply by looking at a sample of the XML data in a web browser and figuring out the structure.

  • Headlines from Slashdot

    Slashdot syndicates its news headlines using its own XML format, called Backslash. The following code will display and link to the 5 most recent headlines, including the current comment count for each:
    <MTGetXML location="http://slashdot.org/slashdot.xml">
     <MTGetXMLElement name="story" limit="5">
      <p><a href="<$MTGetXMLValue name="url"$>"><$MTGetXMLValue name="title"$></a>
      (<$MTGetXMLValue name="comments"$> comments)</p>
     </MTGetXMLElement>
    </MTGetXML>
  • Book Data from All Consuming

    All Consuming tracks mentions of books (links to Amazon.com) on weblogs, and lets users sign up and create reading lists. There is an mt-allconsuming plugin, but it currently only supports the "currently reading" and "favorite books" lists. The author is planning to update the plugin to take advantage of all the different lists All Consuming makes available (which GetXML actually accesses through the REST interface, but the former page describes the various methods in more detail).

    Let's say you have certain entries in which you review books, and you use the Excerpt field of each entry to store the book's ISBN. The following code (used within MTEntries) will link to weblogs that have mentioned the book in the past 7 days. The data will be cached for 60 minutes. Since some results will have an excerpt from the weblog and some won't, we use suppressempty="1" so blank excerpts are considered not to exist, and then display the excerpt as the link text if there is one.

    <MTGetXML location="http://allconsuming.net/rest.cgi" 
    action="GetWeblogMentionsForBook" days_back="7" isbn="[MTEntryExcerpt]" 
    suppressempty="1" cache="60">
     <MTIfXMLElementExists name="weblogs">
      <p>Other people talking about this book:
      <ul>
      <MTGetXMLElement name="weblogs">
       <li><a href="<$MTGetXMLValue name="url"$>">
       <MTIfXMLElementExists name="excerpt">
        "<$MTGetXMLValue name="excerpt"$>..."
       </MTIfXMLElementExists>
       <MTIfXMLElementNotExists name="excerpt">
        <$MTGetXMLValue name="url"$>
       </MTIfXMLElementNotExists>
       </a></li>
      </MTGetXMLElement>
      </ul>
     </MTIfXMLElementExists>
    </MTGetXML>
  • Bible Passages from Crossway

    The Bible publisher Crossway has an online version of their English Standard Version text of the Bible. They make the text available through an API that can be accessed with GetXML.

    Let's say you have entries in which you talk about Bible passages, and you store the passage's name (for example, Mark 1:10-15) in the Extended Entry field. The following code (within MTEntries) will display the numbered verses of the passage.

    <MTGetXML 
    location="http://www.gnpcb.org/esv/share/get/?key=TEST&passage=[MTEntryExtended space_to_plus='1']&action=doPassageQuery&output-format=crossway-xml-1.0" 
    cache="10080">
     <MTGetXMLElement name="passage">
      <MTGetXMLElement name="content">
       <MTGetXMLElement name="verse-unit">
        <$MTGetXMLValue name="verse-num"$>. 
        <$MTGetXMLValue name="content"$><br>
       </MTGetXMLElement>
      </MTGetXMLElement>
     </MTGetXMLElement>
    </MTGetXML>

    A few notes about this code:

    • Since we're using key=TEST, the actual text will be semi-gibberish; you'll need to obtain a free access key if you want to get actual text.
    • The space_to_plus global attribute is used on the passage name, so that the text will be suitable for use in a URL.
    • Since Bible verses are unlikely to change anytime soon, it seems reasonable to cache the data for a week (10080 minutes).
    • We have to put everything into the location attribute, because output-format is not a valid tag attribute name (thanks to the hyphen).
    • The content of each verse is actually a set of sibling elements, each of which contains a simple value, so we can use MTGetXMLValue to display them strung together. The following would accomplish the same thing:
          <MTGetXMLElement name="content">
           <$MTGetXMLValue$>
          </MTGetXMLElement><br>

Version History

7/12/04 - version 1.1

  • Plugin now registers itself with MT 3 interface.
  • Added $VERSION variable.
  • Conditional container tags are now declared as conditional tags, so they should work with MTElse.
  • All container tags now pass conditions along when building contents, so they'll work outside conditional tags within MTEntries, etc.

7/17/03 - version 1.0 released

