Extending the life of my Cocoon

As anyone who has worked with me will know, I value my sanity. So, I like to keep things simple.

When it comes to building web-apps, I nearly always start with the Apache Java framework Cocoon 2.12.

I use the 2.12 model as I have a build of it that allows me to merge in my development blocks without rebuilding the beast over and over. I can do the heavy lifting of an application with XSLT 2.0, so life is pretty sweet.

By using Cocoon as the framework to 'glue' everything together, I am able to re-use working pipelines that have served me for the last 10 years. However, with the advent of JSON as the data-transfer format of choice, and the simplicity of things like Cube for handling time-series data, I knew it was time to find a more productive way of writing generators.

After some experimentation it dawned on me that Node.js supports the piping of JSON streams. This has allowed me to keep writing my pipelines in Cocoon but to proxy off to a Node.js service which is responsible for converting MongoDB's JSON responses into an XML stream for consumption by the transformer stage of my Cocoon pipeline.

This may sound like gobbledygook to you, but it's a real competitive advantage for me. Before this breakthrough, I was using HTTP GET to fetch JSON from MongoDB. My pipeline would have to wait until the JSON had been received in full before the resulting text could be converted to XML. OK for small queries, but in the new world of big data this was taking much too long. Now, the Cocoon file generator can proxy off to the Node.js service URI and let Node.js stream the JSON back as bite-size XML elements for processing by the Cocoon pipeline. This fits very nicely with the SAX-based architecture of Cocoon and optimises processing.
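
To make 'bite-size XML elements' concrete: each Cube event arrives as a small JSON object and gets converted, as it arrives, into a standalone element the transformer can chew on immediately. The field names here are illustrative rather than my actual schema :-

<!-- one Cube event, e.g. {"time":"2013-02-01T12:00:00Z","data":{"site":"site-01","msg":"login failed"}} -->
<!-- becomes this little chunk of XML on the wire: -->
<match>
    <time>2013-02-01T12:00:00Z</time>
    <data>
        <site>site-01</site>
        <msg>login failed</msg>
    </data>
</match>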

God is in the details, so let's have a look at how this technique reduces the time required to deliver the Excel workbook by an order of magnitude or two.

First, the Cocoon sitemap.xmap matcher that picks up the user request for an .xlsx Excel spreadsheet built from some Cube data stored in MongoDB :-

<map:match pattern="node.xlsx">

<map:act type="set-header">

<map:parameter name="Cache-Control" value="Pragma:no-cache" />

<map:parameter name="Content-Disposition" value="attachment;filename=node.xlsx" />

</map:act>

<map:generate type="jx" src="zipNodeXlsx.xml" label="xml"/>

<map:serialize type="zip" mime-type="application/vnd.ms-excel"/>

</map:match>



This zips up various XML files, as pointed to by the zipNodeXlsx.xml file :-


<zip:archive xmlns:zip="http://apache.org/cocoon/zip-archive/1.0">
    <zip:entry name="[Content_Types].xml" src="cocoon:/contentTypes"/>
    <zip:entry name="xl/worksheets/sheet.xml"
        src="cocoon:/nodeQuery?query=acmsg(name,site,region,msg).re(msg,'${cocoon.request.getParameter('pattern')}')&amp;start=${cocoon.request.getParameter('start')}&amp;stop=${cocoon.request.getParameter('stop')}&amp;limit=${cocoon.request.getParameter('limit')}"/>
    <zip:entry name="xl/styles.xml" src="cocoon:/styleXl"/>
    <zip:entry name="xl/workbook.xml" src="cocoon:/workBook"/>
    <zip:entry name="xl/_rels/workbook.xml.rels" src="cocoon:/workRels"/>
    <zip:entry name="_rels/.rels" src="cocoon:/topRels"/>
</zip:archive>
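
Most of those entries are static boilerplate that any .xlsx package needs. As a rough guide, the [Content_Types].xml part that the cocoon:/contentTypes pipeline has to produce for a single-sheet workbook looks something like this (the exact overrides depend on which parts you include) :-

<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types">
    <Default Extension="rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/>
    <Default Extension="xml" ContentType="application/xml"/>
    <Override PartName="/xl/workbook.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml"/>
    <Override PartName="/xl/worksheets/sheet.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.worksheet+xml"/>
    <Override PartName="/xl/styles.xml" ContentType="application/vnd.openxmlformats-officedocument.spreadsheetml.styles+xml"/>
</Types>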



The nodeQuery entry uses the Node.js service to execute the query via this simple pipeline :-


<map:match pattern="nodeQuery">

<map:generate src="http://localhost:1337/?expression={url-encode:{request-param:query}}&start={url-encode:{request-param:start}}&stop={url-encode:{request-param:stop}}&limit={url-encode:{request-param:limit}}"/>

<map:transform type="saxon" src="excelNodexml.xsl"/>

<map:serialize type="xml"/>

</map:match>
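
I haven't reproduced excelNodexml.xsl here, but the shape of the transform is simple enough: turn each <match> element into a SpreadsheetML row. A minimal sketch, assuming the <matches> root emitted by the Node.js service below, with one inline-string cell per leaf value (my real stylesheet differs) :-

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
    <!-- one worksheet row per <match>, one inline-string cell per leaf element -->
    <xsl:template match="/">
        <worksheet>
            <sheetData>
                <xsl:for-each select="/matches/match">
                    <row r="{position()}">
                        <xsl:for-each select="descendant::*[not(*)]">
                            <c t="inlineStr">
                                <is><t><xsl:value-of select="."/></t></is>
                            </c>
                        </xsl:for-each>
                    </row>
                </xsl:for-each>
            </sheetData>
        </worksheet>
    </xsl:template>
</xsl:stylesheet>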



And this is the Node.js service :-


var http = require('http')
  , request = require('request')
  , jss = require('JSONStream')
  , es = require('event-stream')
  , js2 = require('js2xmlparser')
  , url = require('url')
  , options = { declaration: { include: false }, prettyPrinting: { enabled: false } };

var server = http.createServer(function (req, res) {
    // req is an http.IncomingMessage, which is a Readable Stream
    // res is an http.ServerResponse, which is a Writable Stream

    // we want to get the data as utf8 strings
    req.setEncoding('utf8');

    // drain the request body so that 'end' will fire
    req.on('data', function (chunk) {});

    var query = url.parse(req.url, true).query;

    req.on('end', function () {
        try {
            // proxy the query through to Cube and convert the JSON response,
            // one array element at a time, into <match> XML elements
            var readable = request({url: 'http://localhost:1081/1.0/event?expression=' + query.expression
                    + '&start=' + query.start + '&stop=' + query.stop + '&limit=' + query.limit})
                .pipe(jss.parse('*'))
                .pipe(es.mapSync(function (data) { return js2('match', data, options); }));

            res.statusCode = 200;       // must be set before the first write, or it has no effect
            res.write('<matches>');     // root element keeps the streamed XML well-formed
            readable.on('data', function (chunk) {
                res.write(chunk);
            });
            readable.on('end', function () {
                res.write('</matches>');
                res.end();
            });
        } catch (er) {
            res.statusCode = 400;
            return res.end('Error: ' + er.message);
        }
    });
});

server.listen(1337);
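
One design note: the hand-rolled data/end listeners inside the 'end' handler do the job, but Node's pipe() would buy backpressure for free, pausing the MongoDB side whenever Cocoon is slow to drain the response. A sketch of that variant, a drop-in for the two readable.on(...) listeners above (same illustrative <matches> root) :-

// instead of forwarding 'data' events by hand:
res.statusCode = 200;
res.write('<matches>');                      // open the root element
readable.on('end', function () {
    res.end('</matches>');                   // close the root, then finish the response
});
readable.pipe(res, { end: false });          // end:false keeps res open for the closing tag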


So, brothers and sisters, I don't have to write and debug any new generators in Java.
