MARC2Solr (Slight Return)

A while back, Andrew Nagy posted an XSLT for turning MARCXML into Solr’s XML indexing format. I thought it would be fun to take his XSLT and do the same thing in XQuery. I think it is pretty much a one-to-one conversion.

For the upcoming Code4Lib preconference, I thought about forming an XQuery group. I ended up joining the Java group, though, because there aren’t any native HTTP libs in XQuery (so I’d have to do that as an extension in Java anyway). I still think doing an XQuery group would be fun though.

For instance, one nice feature of XQuery is that it allows you to be as strongly or loosely typed as you’d like. Take off all the “as …” clauses from the XQuery and it still works just fine (it just won’t be so picky about what you pass into, or return from, its functions).
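
To make that concrete, here is a minimal sketch of the same function written both ways. The function names and the record/title elements are made up for illustration (they are not taken from marc2solr.xq or from MARCXML); the only difference between the two declarations is the presence of the “as …” clauses.

```xquery
xquery version "1.0";

(: Hypothetical example: names and element structure are illustrative only. :)

(: Strictly typed: the argument must be exactly one element and the
   result exactly one xs:string, or the processor raises a type error. :)
declare function local:title-typed($record as element()) as xs:string {
  string($record/title[1])
};

(: Loosely typed: same body with the "as" clauses removed, so the
   parameter and the return value default to any sequence of items. :)
declare function local:title-untyped($record) {
  string($record/title[1])
};

let $record := <record><title>Example</title></record>
return (local:title-typed($record), local:title-untyped($record))
```

Both calls return the same string for this input. The typed version simply complains earlier if it is handed something other than a single element.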

Recently, I’ve found myself on both sides of this fence. When working with a little bit of throw-away Java code, I’ve found myself wishing for a little of Ruby’s loose typing. On the other hand, sometimes when experimenting with Ruby, I mutter to myself: “Why can’t this just be strongly typed so I know what to expect and do?”

XQuery really gives you the best of both worlds. This isn’t to say XQuery can do everything those other languages can (it can’t… and far from it). But, if you are working with XML (and want to focus on the data rather than the data’s source) I can’t think of a nicer language to use. It will be interesting to watch XQuery grow as a programming language.

So anyway… since my marc2solr.xq is written as a module, you’ll need to call it from something else. This little XQuery works fine from Saxon (pass in the location of a MARCXML file on the file system as $input):

xquery version "1.0";

import module
  namespace marc2solr = "http://lisforge.net/ns/marc2solr"
  at "marc2solr.xq";

declare variable $input external;

marc2solr:add-records(doc($input))
February 19, 2007 • Posted in: Search, XQuery
