zaterdag 22 februari 2014

Every day PostgreSQL: Using XML functions for quick HTML previews.

Over tbe past months I've been working with OpenCV for image processing. One of the projects I'm working on requires comparin histograms. I won't go into any sort of detail about that, suffice it to say that I noe have a tabel that holds a few million comparison results from a few thousand images that have been compared to eachother. The next step is of course to process the results and find images that meet the requirements. A regular query can return the names of the images, but is need to see the actual images to check if the query returned matching images. I could un the query,fetch the data into a language and write code that generates HTML with the names of the images in an IMG tag, or I could use PostgreSQL's XML functions to generate an HTML page, so I only have to read one field from the resultset and print it.

-- Create a table of images
CREATE TEMPORARY TABLE foo (id int, filename TEXT);

-- Step one: create image tags
INSERT INTO foo (id, filename) VALUES (1,'basil.jpg'),(2,'sybil.jpg'),(3,'polly.jpg');

-- Step 2: aggreate the image tags inside a body tag.
-- I use CTE here to make the code more readable.
WITH imagetags AS ( SELECT XMLELEMENT(name img, XMLATTRIBUTES(filename AS src)) AS imgtag FROM foo )
SELECT XMLELEMENT(name body, XMLAGG(imgtag)) FROM imagetags;

-- The final code, added an HTML tag and a header with a CSS link.
WITH imagetags AS ( SELECT XMLELEMENT(name img, XMLATTRIBUTES(filename AS src)) AS imgtag FROM foo)
SELECT XMLELEMENT(name html, XMLELEMENT(name head, XMLELEMENT(name link, XMLATTRIBUTES('style.css' AS href, 'stylesheet' AS rel))),
XMLELEMENT(name body, XMLAGG(imgtag))) as h FROM imagetags;

Now all I have to do in the program is execute the query, fetch one record and print the value of column 'h'.

How mainatainable is this? Well, you could just plonk this into a view, which makes updating it a matter of executing the CREATE statement of that view once for every mutation, just like you'd press 'save' after updating a PHP file, and in both cases you have to reload the page to see the result, so there isn't much between both methods.