Tuesday, January 19, 2010

xquery4j in action

In my previous article, I introduced xquery4j (http://github.com/rafalrusin/xquery4j), a wrapper library for Saxon.
Here, I will explain how to use it to build Article, an XHTML article generator written in Java and XQuery. You can download it here: http://github.com/rafalrusin/Article. It's a simple DSL for article generation.

I think it is worth a look, because the whole project took very little time to implement and still has some interesting features:
  • syntax highlighting of embedded code for many programming languages (using the external program highlight),
  • automatic href entries for links, so you don't need to type a URL twice,
  • native integration with XHTML constructs.

This is an example of the input it takes:
<a:article xmlns='http://www.w3.org/1999/xhtml' xmlns:a="urn:article">
<a:l>Some text</a:l>
<a:code lang="xml"><![CDATA[
<someXml/>
]]></a:code>
</a:article>

It generates XHTML output from this input with the command:
./run <input.xml >output.xhtml

The interesting thing is that the XQuery expression for this transformation is very simple to write with Saxon. This is the complete code:
declare namespace a="urn:article";
declare default element namespace "http://www.w3.org/1999/xhtml";

declare function a:processLine($l) {
for $i in $l/node()
return
typeswitch ($i)
  case element(a:link, xs:untyped) return <a href="{$i/text()}">{$i/text()}</a>
  default return $i
};

declare function a:articleItem($i) {
typeswitch ($i)
 case element(a:l, xs:untyped) return (a:processLine($i), <br/>)

 case element(a:code, xs:untyped) return
  ( a:highlight($i/text(), $i/@lang)/body/* , <br/>)

 default return "error;"
};

<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>a.xml</title>
<link rel="stylesheet" type="text/css" href="highlight.css"/>
</head>
<body>
{
for $i in a:article/*
return
a:articleItem($i)
}
</body>
</html>

This expression calls a:highlight, a Java function bound into XQuery, which takes two strings as input (the code and its language) and returns a DOM Node containing the XHTML output of the highlight command.
Since manipulating DOM with xquery4j is little trouble, the a:highlight function can be as simple as this:
    public static class Mod {
        public static Node highlight(final String code, String lang) throws Exception {
            Validate.notNull(lang);
            final Process p = new ProcessBuilder("highlight", "-X", "--syntax", lang).start();
            Thread t = new Thread(new Runnable() {

                public void run() {
                    try {
                        OutputStream out = p.getOutputStream();
                        IOUtils.write(code, out);
                        out.flush();
                        out.close();
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                }
            });
            t.start();
            String result = IOUtils.toString(p.getInputStream());
            t.join();
            return DOMUtils.parse(result).getDocumentElement();
        }
    }

Please note that feeding input to the highlight command from a separate thread is required. The pipes connected to the spawned Process have limited buffers, so writing all of the input before reading any output could lead to a deadlock. That is why we collect the output of the Process concurrently.
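The same pattern can be tried in isolation. The following is a minimal sketch, not code from Article: pipeThrough is a hypothetical helper name, it uses only JDK classes instead of Commons IO, and it assumes a Unix-like system where the cat command (standing in for highlight) simply echoes its input.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PipeThroughProcess {

    /** Sends input to the command's stdin and returns everything it writes to stdout. */
    static String pipeThrough(String input, String... command)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        // Let the child's stderr go to our stderr so it cannot fill an unread pipe.
        builder.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process p = builder.start();

        // Writer thread: feeds stdin while the calling thread drains stdout below.
        Thread writer = new Thread(() -> {
            try (OutputStream out = p.getOutputStream()) {
                out.write(input.getBytes(StandardCharsets.UTF_8));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        writer.start();

        // Drain stdout concurrently so the child never blocks on a full pipe.
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        try (InputStream in = p.getInputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                collected.write(buf, 0, n);
            }
        }
        writer.join();
        p.waitFor();
        return new String(collected.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        // About 1 MB of input, larger than typical OS pipe buffers.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < (1 << 20); i++) {
            sb.append('x');
        }
        String echoed = pipeThrough(sb.toString(), "cat"); // assumes "cat" is on the PATH
        System.out.println("echoed characters: " + echoed.length());
    }
}
```

With an input this large, writing everything on the calling thread before reading would most likely hang, because cat stops reading once its own output pipe is full.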
At the end, we need to convert a String to DOM. xquery4j's DOMUtils.parse(result) does that in a single call, so the whole construct stays very simple.
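For readers without xquery4j at hand, here is a rough idea of what a String-to-DOM helper can look like with the JDK's own JAXP API. This is only a sketch under my own assumptions: StringToDom and its parse method are hypothetical names, and I am not claiming that DOMUtils.parse is implemented this way.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class StringToDom {

    /** Parses an XML string into a namespace-aware DOM Document. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        // XHTML elements live in a namespace, so the parser must be namespace-aware.
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<html xmlns='http://www.w3.org/1999/xhtml'><body><pre>x</pre></body></html>");
        // getDocumentElement() is the node that a:highlight returns in the article.
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```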

Saturday, January 16, 2010

Embedding XQuery in Java

XQuery is a very powerful language. It can be very useful when you want to do some XML processing in Java.
Let's say you want to create an XML document based on some other XML data. Given something like this:
<employees>
  <employee>
    <name>Fred Jones</name>
    <address location="home">
      <street>900 Aurora Ave.</street>
      <city>Seattle</city>
      <state>WA</state>
      <zip>98115</zip>
    </address>
    <address location="work">
      <street>2011 152nd Avenue NE</street>
      <city>Redmond</city>
      <state>WA</state>
      <zip>98052</zip>
    </address>
    <phone location="work">(425)555-5665</phone>
    <phone location="home">(206)555-5555</phone>
    <phone location="mobile">(206)555-4321</phone>
  </employee>
 </employees>

You want to produce employees' names:
<names>
  <name>Fred Jones</name>
</names>

In XQuery it's just as easy as:
<names>
  {for $name in employees/employee/name/text() return <name>{$name}</name>}
</names>

The main advantage of XQuery over other methods of generating XML is that it operates natively on XML.
There are tools like JAXB and XMLBeans, which enable strongly typed XML building directly from Java code. But this approach often requires writing a lot of Java code that is not really necessary.
It is also possible to use XPath. However, it's an inferior solution to XQuery, because XPath doesn't provide a way of building XML documents. It's designed only for node selection. On the other hand, XQuery extends XPath, so it supports every construct that XPath does.
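To make the comparison concrete, here is a sketch of the employee-names transformation done with plain XPath plus DOM from the JDK. The class and method names (NamesWithXPath, extractNames) are mine, chosen for illustration. XPath does the selection, and everything else is hand-written tree building and serialization.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class NamesWithXPath {

    /** Builds a <names> document from the <name> elements of an <employees> document. */
    static String extractNames(String employeesXml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document input = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(employeesXml)));

        // XPath only selects the existing <name> nodes.
        NodeList names = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate("/employees/employee/name", input, XPathConstants.NODESET);

        // Building the result has to be done by hand with the DOM API.
        Document output = dbf.newDocumentBuilder().newDocument();
        Element root = output.createElement("names");
        output.appendChild(root);
        for (int i = 0; i < names.getLength(); i++) {
            Element name = output.createElement("name");
            name.setTextContent(names.item(i).getTextContent());
            root.appendChild(name);
        }

        // Serialize the DOM tree back to a String.
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter writer = new StringWriter();
        serializer.transform(new DOMSource(output), new StreamResult(writer));
        return writer.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<employees><employee><name>Fred Jones</name></employee></employees>";
        System.out.println(extractNames(xml));
    }
}
```

Compare this with the three-line XQuery expression above, which selects the nodes and builds the result in one place.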
Another way to do such processing in Java is to use XSLT. It's the closest approach to XQuery. But the problem with XSLT is that its language constructs, like 'for', are expressed as XML elements. This makes writing XSLT code much more difficult than writing XQuery.
XQuery can be seen as a native template language for XML processing.
So the question is: what is the best way to evaluate XQuery expressions from Java?
There are some Open Source implementations of XQuery for Java. One of them is inside XMLBeans. However, in my opinion the best way is to use Saxon. It's the most mature project for XQuery processing and it's targeted directly at doing that.
However, Saxon might be a bit difficult to use directly. At least, digging a few interesting features out of it took me some time.
So I decided to write a simple class for interfacing with Saxon and to provide a few interesting examples of how to use it. That's how xquery4j was born. You can download it from GitHub: http://github.com/rafalrusin/xquery4j.
In xquery4j, you can execute XQuery expressions from Java in a simple way:
XQueryEvaluator evaluator = new XQueryEvaluator();
Long result = (Long) evaluator.evaluateExpression("5+5", null).get(0);
Assert.assertEquals(Long.valueOf(10), result);

Binding variables to Java objects is just as easy:
evaluator.bindVariable(QName.valueOf("myVar"), 123);

Sometimes it's also useful to declare Java methods and bind them for XQuery expressions. This is also very simple to do with xquery4j:
public static class MyFunctions {
    public static String myHello(String arg) {
        // Fetch the context object that was set on the evaluator.
        TestEvaluator te = (TestEvaluator) XQueryEvaluator.contextObjectTL.get();
        te.id++;
        return "hello(" + arg + te.id + ")";
    }
}

// Inside an instance method of TestEvaluator:
XQueryEvaluator evaluator = new XQueryEvaluator();
evaluator.setContextObject(this);
evaluator.declareJavaClass("http://my.org/employees", MyFunctions.class);

This code sets the context object to 'this' and binds all static methods of the MyFunctions class so that XQuery expressions can call them. While myHello runs on behalf of an XQuery expression, it can operate on Java state from the bound context object, the 'id' field in this case.
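How can a static method reach a per-evaluator object? The name contextObjectTL suggests a thread-local variable. The sketch below shows that general pattern in plain Java. It is an illustration under that assumption, not xquery4j's actual implementation, and CONTEXT, Counter and evaluate are hypothetical names.

```java
class ContextObjectSketch {

    // Hypothetical stand-in for XQueryEvaluator.contextObjectTL.
    static final ThreadLocal<Object> CONTEXT = new ThreadLocal<Object>();

    /** Plays the role of the object passed to setContextObject(this). */
    static class Counter {
        int id;
    }

    /** Plays the role of a bound static function such as myHello. */
    static String myHello(String arg) {
        Counter c = (Counter) CONTEXT.get();
        c.id++;
        return "hello(" + arg + c.id + ")";
    }

    /** Plays the role of the evaluator: publish the context, call, clean up. */
    static String evaluate(Object context, String arg) {
        CONTEXT.set(context);
        try {
            return myHello(arg);
        } finally {
            // Avoid leaking the context into later work on the same thread.
            CONTEXT.remove();
        }
    }

    public static void main(String[] args) {
        Counter c = new Counter();
        System.out.println(evaluate(c, "a")); // prints hello(a1)
        System.out.println(evaluate(c, "a")); // prints hello(a2)
    }
}
```

Because the context lives in a thread-local, two evaluators running on different threads would not see each other's context objects.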
Here's how to invoke the bound myHello method from XQuery:
declare namespace my = 'http://my.org/employees'; my:myHello("hello")

The xquery4j code contains unit tests, which include the examples above.
You can run them with:
mvn package

Feel free to give some feedback on using it.