eXist: An XML Database

9th Dec 2006

I've been playing today with eXist, an open-source XML database. I've never really looked into XML databases before now, largely because most of them are commercial and so not available for me to toy with.

It's a bit of a paradigm shift for people like me who have grown up with relational databases and SQL. The database consists of a hierarchical set of collections, inside which are resources. Resources can be arbitrary binary files or they can be XML documents. In the case of XML documents, you can then run queries across the database using XQuery, with optional indexes on the data to speed things up.
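
As a minimal sketch of what such a query looks like, assuming a hypothetical collection at /db/weblog that holds RSS documents (both the path and the search term are made up for illustration), this returns the titles of all entries that mention XML:

```xquery
(: Hypothetical collection path; change it to wherever the documents live. :)
for $item in collection('/db/weblog')/rss/channel/item
(: Keep only the items whose title contains the search term. :)
where contains($item/title, 'XML')
return $item/title
```

The standard collection() function limits the query to the documents under that collection rather than searching the whole database.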

If we have a collection containing a bunch of weblog entries represented as RSS documents, we can select them as XHTML directly out of the database using the following query:

<html:body xmlns:html="http://www.w3.org/1999/xhtml"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
{
for $entry in /rss/channel/item
order by $entry/dc:date
return
    <html:div class="entry">
        <html:h1>
            <html:a href="{data($entry/link)}">
                {data($entry/title)}
            </html:a>
        </html:h1>
        <html:div class="entrytext">
            {data($entry/description)}
        </html:div>
        <html:div class="entryinfo">
            Posted on
            <html:a href="{data($entry/../link)}">
                {data($entry/../title)}
            </html:a>
            at
            {data($entry/dc:date)}
        </html:div>
    </html:div>
}
</html:body>

You can try this directly on the XQuery Sandbox demo on eXist's own site, which is also included with the distribution. They have their own weblog plus some other random things in their database that are — at the time of writing at least — successfully returned as a list of XHTML div elements.

XQuery uses XPath for its node matching and function library, and is capable of doing transforms from one XML format to another. In many ways, it's like a simpler version of XSLT. It's not a W3C Recommendation yet, but it's implemented in several places nonetheless. It's too bad that there doesn't seem to be a CPAN XQuery implementation yet, as I'd love to start using it in place of XSLT for a bunch of things. XSLT seems like a good fit for document-like XML (DocBook, XHTML, etc) while XQuery seems to me to make more sense for data-oriented XML like Atom. XQuery seems a lot more readable to me than XSLT as well.
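
To give a flavour of that kind of data-oriented transform, here is a rough sketch that turns the same RSS items into Atom entries. It assumes the items carry dc:date as in the query above, and it leaves out elements that a valid Atom feed requires (such as id and the feed-level title), so treat it as an illustration rather than a working converter:

```xquery
declare namespace dc = "http://purl.org/dc/elements/1.1/";

<atom:feed xmlns:atom="http://www.w3.org/2005/Atom">
{
(: Walk every RSS item and emit a skeletal Atom entry for it. :)
for $item in /rss/channel/item
return
    <atom:entry>
        <atom:title>{data($item/title)}</atom:title>
        <atom:link href="{data($item/link)}"/>
        <atom:updated>{data($item/dc:date)}</atom:updated>
    </atom:entry>
}
</atom:feed>
```

I've used an atom: prefix rather than a default namespace here because a default namespace declared on the constructor would also apply to the unprefixed rss, channel and item steps inside it, and they would then match nothing.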

Now I'm just trying to think of something I can actually use this XML Database for!
