<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Bruised Edge &#187; Ruby</title>
	<atom:link href="http://weblog.kevinclarke.info/category/ruby/feed/" rel="self" type="application/rss+xml" />
	<link>http://weblog.kevinclarke.info</link>
	<description>Digital Libraries, Repositories, Programming, Technology, Librarianship, etc.</description>
	<lastBuildDate>Wed, 28 Jul 2010 03:19:41 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Libxml-Ruby vs. REXML in Ruby-MARC</title>
		<link>http://weblog.kevinclarke.info/2007/01/07/libxml-ruby-vs-rexml-in-ruby-marc/</link>
		<comments>http://weblog.kevinclarke.info/2007/01/07/libxml-ruby-vs-rexml-in-ruby-marc/#comments</comments>
		<pubDate>Mon, 08 Jan 2007 02:41:48 +0000</pubDate>
		<dc:creator>ksclarke</dc:creator>
				<category><![CDATA[MARC]]></category>
		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://kevinclarke.info/weblog/?p=255</guid>
		<description><![CDATA[This weekend I reimplemented the XMLReader and XMLWriter classes in ruby-marc using Libxml-Ruby, a Ruby layer over the Libxml2 C library. Currently, ruby-marc uses REXML, a pure Ruby XML library. Since REXML is built into Ruby, it is convenient. I was curious, though, how much of a performance boost there would be from using Libxml2. [...]]]></description>
			<content:encoded><![CDATA[<p>This weekend I reimplemented the XMLReader and XMLWriter classes in <a href="http://www.textualize.com/ruby_marc" title="ruby-marc">ruby-marc</a> using <a href="http://libxml.rubyforge.org/" title="ruby-libxml">Libxml-Ruby</a>, a Ruby layer over the <a href="http://xmlsoft.org/" title="libxml2">Libxml2 C library</a>.</p>
<p>Currently, ruby-marc uses <a href="http://www.germane-software.com/software/rexml/" title="rexml">REXML</a>, a pure-Ruby XML library. Because REXML ships with Ruby, it is convenient. I was curious, though, how much of a performance boost Libxml2 would give. Here are the results of my very informal test (using some HCL MARC data):</p>
<blockquote>
<table>
<tr>
<th></th>
<th>User</th>
<th>System</th>
<th>Total</th>
<th>Real</th>
</tr>
<tr>
<th>XMLReader [old]: </th>
<td>24.300000</td>
<td>0.030000</td>
<td>24.330000</td>
<td>25.607547</td>
</tr>
<tr>
<th>XMLReader [new]: </th>
<td>3.180000</td>
<td>0.010000</td>
<td>3.190000</td>
<td>3.231896</td>
</tr>
<tr>
<th>XMLWriter [old]: </th>
<td>38.960000</td>
<td>0.060000</td>
<td>39.020000</td>
<td>41.017238</td>
</tr>
<tr>
<th>XMLWriter [new]: </th>
<td>11.950000</td>
<td>0.050000</td>
<td>12.000000</td>
<td>12.607114</td>
</tr>
</table>
</blockquote>
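<p>The user, system, total and real columns are the same four values that Ruby's standard Benchmark module reports, so a timing run of this kind could be sketched roughly as follows. This is only an illustration: the build_record workload is a hypothetical stand-in, not the code or data used for the numbers in the table.</p>

```ruby
require 'benchmark'
require 'rexml/document'

# Hypothetical workload: build one tiny record with REXML.
# The informal test in this post read and wrote real MARC data instead.
def build_record
  doc = REXML::Document.new
  record = doc.add_element('record')
  record.add_element('controlfield', 'tag' => '001').text = 'abc123'
  doc
end

# Benchmark.measure returns a Benchmark::Tms object that carries the
# user, system, total and real (wall-clock) times for the block.
timing = Benchmark.measure do
  1_000.times { build_record }
end

# Print the standard header line and the four timing columns.
puts Benchmark::CAPTION
puts timing.format(Benchmark::FORMAT)
```

<p>Running the same block once per implementation and comparing the real column gives the kind of side-by-side figures shown in the table.</p>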
<p>Both XMLWriter times include the new XMLReader reading records in from a source file. As each record is read in, it is written out to a new file. This is only meant to give an inkling of the difference between the two versions, not to serve as a formal benchmark. Lower numbers are better.</p>
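<p>To put the table into ratios, the real times can be divided directly. The small calculation below uses only the numbers reported above. The writer-only figure rests on an assumption of mine: that the new reader costs about the same inside the writer runs as it did when timed on its own. Under that assumption the reader comes out roughly 7.9 times faster, the writer runs roughly 3.3 times faster overall, and the writing step alone roughly 4.0 times faster.</p>

```ruby
# Real (wall-clock) times copied from the table above.
reader_old = 25.607547
reader_new = 3.231896
writer_old = 41.017238 # includes the new reader reading the source file
writer_new = 12.607114 # includes the new reader reading the source file

reader_speedup = reader_old / reader_new
writer_speedup = writer_old / writer_new

# Rough writer-only estimate: subtract the new reader's standalone time
# from both writer runs (an assumption, since it was not measured separately).
writer_only_speedup = (writer_old - reader_new) / (writer_new - reader_new)

puts format('reader:      %.1fx', reader_speedup)
puts format('writer:      %.1fx', writer_speedup)
puts format('writer only: %.1fx', writer_only_speedup)
```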
<p>So, in reimplementing, I completely rewrote the reader. It just reads from a file and returns MARC::Record objects. The XML library underneath is an internal detail, so it can be swapped for any other without callers noticing.</p>
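<p>One way to picture that design is a reader that hands its input to a pluggable backend and only ever yields record objects. The sketch below is not the ruby-marc code: SketchRecord, SketchXMLReader and REXML_BACKEND are hypothetical names, the backend here is REXML so the example runs without extra gems, and it reads the whole input at once rather than streaming.</p>

```ruby
require 'rexml/document'
require 'stringio'

# Hypothetical stand-in for MARC::Record: a list of [tag, value] pairs.
SketchRecord = Struct.new(:fields)

# Default backend: parse with REXML. Any callable with the same contract
# (XML string in, array of SketchRecord out) could replace it.
REXML_BACKEND = lambda do |xml|
  REXML::Document.new(xml).get_elements('//record').map do |rec|
    pairs = rec.get_elements('controlfield').map do |field|
      [field.attributes['tag'], field.text]
    end
    SketchRecord.new(pairs)
  end
end

class SketchXMLReader
  include Enumerable

  # io is anything that responds to #read, such as a File or StringIO.
  def initialize(io, backend = REXML_BACKEND)
    @io = io
    @backend = backend
  end

  # Yield one record object at a time, whichever backend produced it.
  def each
    @backend.call(@io.read).each { |record| yield record }
  end
end

# Build a small sample document with REXML so the sketch has input to read.
sample = REXML::Document.new
collection = sample.add_element('collection')
record = collection.add_element('record')
record.add_element('controlfield', 'tag' => '001').text = 'abc123'

reader = SketchXMLReader.new(StringIO.new(sample.to_s))
records = reader.to_a
puts records.first.fields.inspect
```

<p>Because callers only see the record objects, passing a different callable as the second argument changes the parsing library without touching any calling code.</p>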
<p>With the writer, I changed the encode method so that it now takes an option specifying which library should be used (REXML is still the default). Since the method is public, I figured someone is probably using the REXML Documents it returns, and their code would break if I returned a Libxml Document instead. The write method, on the other hand, now uses Libxml by default.</p>
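<p>A backward-compatible option of that kind might look something like the sketch below. The module name, the :library option key and the array-of-pairs record are all hypothetical, chosen only for illustration, and the Libxml branch is left out so the example runs with nothing but REXML installed.</p>

```ruby
require 'rexml/document'

# Hypothetical sketch, not the actual ruby-marc writer.
module SketchXMLWriter
  # fields is a stand-in for a MARC::Record: an array of [tag, value] pairs.
  # opts[:library] picks the XML library; REXML stays the default so that
  # existing callers keep receiving a REXML::Document.
  def self.encode(fields, opts = {})
    library = opts.fetch(:library, :rexml)
    case library
    when :rexml
      encode_rexml(fields)
    when :libxml
      raise NotImplementedError, 'Libxml-Ruby branch omitted from this sketch'
    else
      raise ArgumentError, "unknown XML library: #{library.inspect}"
    end
  end

  def self.encode_rexml(fields)
    doc = REXML::Document.new
    record = doc.add_element('record')
    fields.each do |tag, value|
      record.add_element('controlfield', 'tag' => tag).text = value
    end
    doc
  end
end

# Callers that pass no option get the same return type as before.
doc = SketchXMLWriter.encode([['001', 'abc123']])
puts doc.class
```

<p>Keeping the old library as the default means only callers who explicitly ask for the new one see a different return type.</p>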
<p>I haven&#8217;t checked in any of these changes yet (since I haven&#8217;t passed them by Ed and don&#8217;t know whether they should be incorporated), but I have validated that the existing tests still pass just fine.</p>
<p>The speed improvements are pretty nice. If an extra dependency can be tolerated, it would be nice to have the performance boost. The only other caveat is that I used the 0.4.0pre01 version of Libxml-Ruby. It might be desirable to wait until the final 0.4.0 release.</p>
<p>Anyway, I&#8217;ll get Ed&#8217;s opinion on all this sometime next week. Right now, it is just a fun experiment.</p>
]]></content:encoded>
			<wfw:commentRss>http://weblog.kevinclarke.info/2007/01/07/libxml-ruby-vs-rexml-in-ruby-marc/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
