<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>brant interactive</title>
	<atom:link href="http://brantinteractive.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://brantinteractive.com</link>
	<description>code, insights, and observations</description>
	<lastBuildDate>Fri, 16 Jul 2010 15:20:49 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>ActsAsCachola &#8211; simple caching for AR models (everyone could use a little cachola)</title>
		<link>http://brantinteractive.com/2010/07/16/actsascachola-simple-caching-for-ar-models-everyone-could-use-a-little-cachola/</link>
		<comments>http://brantinteractive.com/2010/07/16/actsascachola-simple-caching-for-ar-models-everyone-could-use-a-little-cachola/#comments</comments>
		<pubDate>Fri, 16 Jul 2010 15:20:49 +0000</pubDate>
		<dc:creator>Rich Brant</dc:creator>
				<category><![CDATA[Ruby/Rails]]></category>

		<guid isPermaLink="false">http://brantinteractive.com/?p=460</guid>
		<description><![CDATA[And you thought all the clever caching names were taken. What is it ActsAsCachola is a plugin that lets you cache any class method by simply prepending ‘cachola_’ to the method name when calling it. Here’s how it works: Given the following model: class Internet &#60; ActiveRecord::Base &#160;&#160;acts_as_cachola &#160; &#160;&#160;def self.get_a_million_numbers &#160;&#160;&#160;&#160;1.upto(1_000_000).inject([]){ &#124;numbers, x&#124; numbers [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>And you thought all the clever caching names were taken.</p>
<h2>What is it</h2>
<p>ActsAsCachola is a plugin that lets you cache any class method by simply prepending ‘cachola_’ to the method name when calling it. Here’s how it works:</p>
<p>Given the following model:</p>
<pre>
class Internet &lt; ActiveRecord::Base
&nbsp;&nbsp;acts_as_cachola
&nbsp;
&nbsp;&nbsp;def self.get_a_million_numbers
&nbsp;&nbsp;&nbsp;&nbsp;1.upto(1_000_000).inject([]){ |numbers, x| numbers &lt;&lt; x }
&nbsp;&nbsp;end
end
</pre>
<p>Now you can call the method, ‘cachola_get_a_million_numbers,’ and the return value of ‘get_a_million_numbers’ will be cached automatically.</p>
<p>Note that if the method accepts arguments, each unique call will have its own key in the cache. For example:</p>
<pre>
class Internet &lt; ActiveRecord::Base
&nbsp;&nbsp;acts_as_cachola
&nbsp;
&nbsp;&nbsp;def self.get_numbers(to_number)
&nbsp;&nbsp;&nbsp;&nbsp;1.upto(to_number).inject([]){ |numbers, x| numbers &lt;&lt; x }
&nbsp;&nbsp;end
end
</pre>
<p>Calling Internet.cachola_get_numbers(100) and Internet.cachola_get_numbers(500) will result in two keys (with different values) stored in the cache.</p>
<p>The cached value is then expired automatically whenever a record of the model that includes the plugin is saved or destroyed. It’s restored to the cache the next time the method is called.</p>
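<p>To make the idea concrete, here is a minimal sketch of the prefix trick in plain Ruby. It is not the plugin’s actual source: the CacholaSketch module, its in-memory Hash store and the expire_cachola! method are names made up for illustration, and a real setup would sit on top of the Rails cache and ActiveRecord callbacks.</p>

```ruby
# Illustrative sketch only, not the ActsAsCachola source.
# CacholaSketch, cachola_store and expire_cachola! are hypothetical names.
module CacholaSketch
  PREFIX = 'cachola_'

  # In-memory stand-in for a real cache store.
  def cachola_store
    @cachola_store ||= {}
  end

  # Intercept calls such as cachola_get_numbers(100).
  def method_missing(name, *args)
    target = name.to_s.delete_prefix(PREFIX)
    return super unless name.to_s.start_with?(PREFIX) && respond_to?(target)

    # Method name plus arguments form the key, so each unique call is cached separately.
    key = [target, args]
    cachola_store.fetch(key) { cachola_store[key] = public_send(target, *args) }
  end

  def respond_to_missing?(name, include_private = false)
    name.to_s.start_with?(PREFIX) || super
  end

  # What a save or destroy callback would trigger.
  def expire_cachola!
    cachola_store.clear
  end
end

class Internet
  extend CacholaSketch

  # Counts how often the real method runs.
  def self.calls
    @calls ||= 0
  end

  def self.get_numbers(to_number)
    @calls = calls + 1
    (1..to_number).to_a
  end
end

Internet.cachola_get_numbers(100)
Internet.cachola_get_numbers(100) # served from the cache
Internet.cachola_get_numbers(500) # different argument, second key
puts Internet.cachola_store.size  # 2
puts Internet.calls               # 2
```

<p>Three calls, two cache keys, and the real method only runs twice. Calling Internet.expire_cachola! empties the store, which is roughly what a save or destroy would do.</p>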
<p>Now, what if your Internet class method ‘get_a_million_numbers’ depends on other objects getting saved or destroyed? That’s the other thing I wanted to make easier. Rather than setting up observers or sweepers, you can add the following to the other model:</p>
<pre>
class WhereAmI &lt; ActiveRecord::Base
&nbsp;&nbsp;acts_as_cachola_notifier =&gt; [:internet]
end
</pre>
<p>Now when your WhereAmI model is either saved or destroyed, the cached methods in the Internet model will be deleted.</p>
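<p>Conceptually, the notifier just tells the other class to drop its cache. Here is a rough plain-Ruby sketch of that relationship. The NOTIFIES constant, expire_cache! and the hand-written save and destroy methods are stand-ins I made up for illustration, not the plugin’s internals, which hook into ActiveRecord instead.</p>

```ruby
# Illustrative sketch only; NOTIFIES, expire_cache! and the hand-rolled
# save/destroy methods are hypothetical stand-ins for the plugin's hooks.
class Internet
  # In-memory stand-in for the cached method results.
  def self.cache
    @cache ||= {}
  end

  def self.expire_cache!
    cache.clear
  end
end

class WhereAmI
  # The classes whose cached values depend on this model.
  NOTIFIES = [Internet]

  def save
    # ... persist the record here, then expire the dependents
    expire_dependents
    true
  end

  def destroy
    # ... delete the record here, then expire the dependents
    expire_dependents
    true
  end

  private

  def expire_dependents
    NOTIFIES.each { |klass| klass.expire_cache! }
  end
end

Internet.cache['get_a_million_numbers'] = [1, 2, 3]
WhereAmI.new.save
puts Internet.cache.empty? # true
```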
<h2>Installation</h2>
<p>script/plugin install git://github.com/rbrant/acts_as_cachola.git</p>
<h2>Where is this going from here?</h2>
<p>Not sure. It does what I need it to do right now. It’s something I’ve found myself doing on two different projects that I thought would just make my life easier.</p>
<h2>Project Info</h2>
<p>ActsAsCachola is hosted on Github: <a href="http://github.com/rbrant/acts_as_cachola">http://github.com/rbrant/acts_as_cachola</a>, where your contributions, forkings, comments and feedback are greatly appreciated. Please do add tests if you want me to pull in any changes.</p>
]]></content:encoded>
			<wfw:commentRss>http://brantinteractive.com/2010/07/16/actsascachola-simple-caching-for-ar-models-everyone-could-use-a-little-cachola/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>I want to show you some SAX.</title>
		<link>http://brantinteractive.com/2010/03/12/i-want-to-show-you-some-sax/</link>
		<comments>http://brantinteractive.com/2010/03/12/i-want-to-show-you-some-sax/#comments</comments>
		<pubDate>Fri, 12 Mar 2010 18:11:05 +0000</pubDate>
		<dc:creator>Rich Brant</dc:creator>
				<category><![CDATA[Ruby/Rails]]></category>

		<guid isPermaLink="false">http://brantinteractive.com/?p=440</guid>
		<description><![CDATA[I had to process some pretty big xml docs recently from the USPTO. Each doc is about 60mb and (oddly enough) contains several thousand individual documents all concatenated. So the document isn&#8217;t valid xml..but that&#8217;s a different story. The reason for writing this was to show a quick demo of how to use SAX to [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>I had to process some pretty big XML docs recently from the USPTO. Each doc is about 60 MB and (oddly enough) contains several thousand individual documents all concatenated. So the document isn&#8217;t valid XML, but that&#8217;s a different story.</p>
<p>The reason for writing this was to show a quick demo of how to use SAX to process a large XML file. <a href="http://www.xml.com/pub/a/2001/02/14/perlsax.html">You can read about SAX here</a>, but basically, SAX (Simple API for XML) is an event-driven model. Rather than reading an entire tree structure into memory, which can be realllly sloooow, it reads the data as a stream and raises events along the way.</p>
<p>The code below uses the <a href="http://nokogiri.org/">Nokogiri</a> library (which as a side note has this odd, albeit entertaining tagline: &#8220;XML is like violence &#8211; if it doesn’t solve your problems, you are not using enough of it.&#8221;). Most other XML parsing libraries also have SAX implementations.</p>
<p>The code below looks for the root node of each doc and builds a string for each individual document. Once a doc has been assembled, it can be processed via the more pleasant:</p>
<pre>
doc = Nokogiri::HTML(xml)
serial = doc.css(&quot;application-reference document-id doc-number&quot;).inner_text
</pre>
<p>So this ends up being sort of a hybrid, and it&#8217;s much, much faster than loading the entire doc at once. It would be faster still not to parse each doc a second time, but the docs are too deeply nested for that, and I need xpath to get at what I need.</p>
<p><script src="http://gist.github.com/330542.js?file=sax"></script></p>
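<p>If you just want to see the event-driven idea in isolation, here is a small self-contained sketch. Note that it uses REXML&#8217;s stream parser rather than Nokogiri, purely so it runs with what ships alongside Ruby, and that it is not the code from the gist. The DocNumberListener class and the sample XML are made up for illustration.</p>

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Illustrative sketch; DocNumberListener and the sample XML are made up.
# The parser calls these methods as it streams through the input,
# so no tree of the whole document is ever built.
class DocNumberListener
  include REXML::StreamListener

  attr_reader :doc_numbers

  def initialize
    @doc_numbers = []
    @capturing = false
  end

  # Fired when an opening tag streams past.
  def tag_start(name, _attributes)
    return unless name == 'doc-number'

    @capturing = true
    @doc_numbers.push(String.new)
  end

  # Fired for text between tags; it may arrive in several chunks.
  def text(value)
    @doc_numbers.last.concat(value) if @capturing
  end

  # Fired when a closing tag streams past.
  def tag_end(name)
    @capturing = false if name == 'doc-number'
  end
end

xml = '<docs>' \
      '<document-id><doc-number>12345678</doc-number></document-id>' \
      '<document-id><doc-number>87654321</doc-number></document-id>' \
      '</docs>'

listener = DocNumberListener.new
REXML::Document.parse_stream(xml, listener)
puts listener.doc_numbers.inspect
```

<p>The Nokogiri version follows the same shape: a handler object with callbacks for start tags, end tags and character data.</p>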
]]></content:encoded>
			<wfw:commentRss>http://brantinteractive.com/2010/03/12/i-want-to-show-you-some-sax/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Good stuff from Bob Martin on software. Like the stuff at 11:20 inre tech/biz disconnect. Not sure about the shirt, however.</title>
		<link>http://brantinteractive.com/2010/02/17/good-stuff-from-bob-martin-on-software-like-the-stuff-at-1120-inre-techbiz-disconnect-not-sure-about-the-shirt-however/</link>
		<comments>http://brantinteractive.com/2010/02/17/good-stuff-from-bob-martin-on-software-like-the-stuff-at-1120-inre-techbiz-disconnect-not-sure-about-the-shirt-however/#comments</comments>
		<pubDate>Wed, 17 Feb 2010 21:36:32 +0000</pubDate>
		<dc:creator>rbrant</dc:creator>
				<category><![CDATA[Ruby/Rails]]></category>

		<guid isPermaLink="false">http://brantinteractive.com/2010/02/17/good-stuff-from-bob-martin-on-software-like-the-stuff-at-1120-inre-techbiz-disconnect-not-sure-about-the-shirt-however/</guid>
		<description><![CDATA[Posted via email from Rich&#8217;s posterous]]></description>
			<content:encoded><![CDATA[<p></p><div class='posterous_autopost'>
<p style="font-size: 10px">  <a href="http://posterous.com">Posted via email</a>   from <a href="http://rbrant.posterous.com/good-stuff-from-bob-martin-on-software-like-t">Rich&#8217;s posterous</a>  </p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://brantinteractive.com/2010/02/17/good-stuff-from-bob-martin-on-software-like-the-stuff-at-1120-inre-techbiz-disconnect-not-sure-about-the-shirt-however/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Seth Godin&#8217;s ebook, &#8216;What matters now&#8217; Free download. Tons of great ideas/things to think about.</title>
		<link>http://brantinteractive.com/2009/12/15/seth-godins-ebook-what-matters-now-free-download-tons-of-great-ideasthings-to-think-about/</link>
		<comments>http://brantinteractive.com/2009/12/15/seth-godins-ebook-what-matters-now-free-download-tons-of-great-ideasthings-to-think-about/#comments</comments>
		<pubDate>Tue, 15 Dec 2009 14:18:10 +0000</pubDate>
		<dc:creator>rbrant</dc:creator>
				<category><![CDATA[Ruby/Rails]]></category>

		<guid isPermaLink="false">http://brantinteractive.com/2009/12/15/seth-godins-ebook-what-matters-now-free-download-tons-of-great-ideasthings-to-think-about/</guid>
		<description><![CDATA[Download now or preview on posterous what-matters-now-2.pdf (3073 KB) Posted via email from Rich&#8217;s posterous]]></description>
			<content:encoded><![CDATA[<p></p><div style='padding: 5px 5px 10px 5px;margin-top: 5px;border: 1px solid #ddd;background-color: #fff;line-height: 16px'>
<div style="float: left;margin-right: 5px;overflow: visible"><a href='http://posterous.com/getfile/files.posterous.com/rbrant/XSiM2ov12LAaCYnUy61I90kCiAhpVUFCvJLeB70QfGM8dpETrotwE94TmDHz/what-matters-now-2.pdf'><img src='http://posterous.com/images/filetypes/pdf.png'></a></div>
<div style="font-size: 10px;color: #424037;line-height: 16px">Download now or <a href='http://rbrant.posterous.com/seth-godins-ebook-what-matters-now-free-downl'>preview on posterous</a></div>
<p><b><a href='http://posterous.com/getfile/files.posterous.com/rbrant/XSiM2ov12LAaCYnUy61I90kCiAhpVUFCvJLeB70QfGM8dpETrotwE94TmDHz/what-matters-now-2.pdf'>what-matters-now-2.pdf</a></b> <span style="font-size: 10px;color: #424037">(3073 KB)</span></p></div>
<p style="font-size: 10px">  <a href="http://posterous.com">Posted via email</a>   from <a href="http://rbrant.posterous.com/seth-godins-ebook-what-matters-now-free-downl">Rich&#8217;s posterous</a>  </p>
]]></content:encoded>
			<wfw:commentRss>http://brantinteractive.com/2009/12/15/seth-godins-ebook-what-matters-now-free-download-tons-of-great-ideasthings-to-think-about/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Project review: ifwinsight.com</title>
		<link>http://brantinteractive.com/2009/12/06/project-review-ifwinsight-com/</link>
		<comments>http://brantinteractive.com/2009/12/06/project-review-ifwinsight-com/#comments</comments>
		<pubDate>Mon, 07 Dec 2009 01:09:22 +0000</pubDate>
		<dc:creator>Rich Brant</dc:creator>
				<category><![CDATA[Ruby/Rails]]></category>
		<category><![CDATA[Software Development]]></category>
		<category><![CDATA[Tools & Tips]]></category>

		<guid isPermaLink="false">http://brantinteractive.com/?p=406</guid>
		<description><![CDATA[It&#8217;s easy to forget what you&#8217;ve learned and what tools you used from project to project. I thought it might be worthwhile to sort of sum up these things either on a weekly basis or project basis. I had a lot of fun on a recent project and thought it would be a good place [...]]]></description>
			<content:encoded><![CDATA[<p></p><p>It&#8217;s easy to forget what you&#8217;ve learned and what tools you used from project to project. I thought it might be worthwhile to sort of sum up these things either on a weekly basis or project basis. I had a lot of fun on a recent project and thought it would be a good place to start. I recently built what is described as a &#8216;tool for intelligently searching US patent application Image File Wrappers (IFWs).&#8217;</p>
<p>Technically, the system allows users to upload PDF documents and have their content indexed and made searchable. The documents are reasonably sized, averaging 25 megs each with several hundred pages. So once uploaded to the server, they are handed to <a href="http://github.com/collectiveidea/delayed_job">delayed job</a> to be processed in the background. I&#8217;m using collective idea&#8217;s fork after watching <a href="http://railscasts.com/episodes/171-delayed-job">Ryan Bates&#8217; screencast on delayed job</a> that points out this fork has a few generators and rake tasks not part of the original.</p>
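<p>The hand-off can be sketched like this. delayed_job will run any object that responds to perform, so the work gets wrapped in a small job object. ProcessDocumentJob, its steps and the file path are hypothetical names for illustration, not the app&#8217;s real code.</p>

```ruby
# Illustrative sketch; ProcessDocumentJob, its steps and the path are hypothetical.
# delayed_job can run any object that responds to #perform.
ProcessDocumentJob = Struct.new(:pdf_path) do
  def perform
    steps.map { |step| "#{step}: #{pdf_path}" }
  end

  def steps
    ['convert pages to images', 'run OCR', 'store page text']
  end
end

job = ProcessDocumentJob.new('/uploads/wrapper.pdf')

# In the Rails app the upload action would queue the job and return right away:
#   Delayed::Job.enqueue(job)
# Here perform is called directly to show what the background worker would run.
puts job.perform
```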
<p>In order to index the document, the PDF needs to be examined by OCR (optical character recognition) software. But before the OCR software can do its OCR-ing, it needs to have an image to examine. So we need to convert the individual PDF pages into images. To accomplish that, I used <a href="http://pages.cs.wisc.edu/~ghost/">ghostscript</a>. It&#8217;s really easy to use and fast. You can hand ghostscript the document, and it will churn out an image of each PDF page, in the resolution of your choice. I&#8217;m using 300&#215;300 dpi, which seems to be a nice balance between processing time, space, and readability/OCR results.</p>
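<p>As a rough sketch, the conversion step boils down to one ghostscript call per document. The helper name, file names and the choice of the tiffgray output device are my assumptions for illustration. The flags themselves are standard ghostscript options.</p>

```ruby
# Illustrative sketch; ghostscript_command and the file names are hypothetical.
# Builds the argument list for converting every page of a PDF into an image.
def ghostscript_command(pdf_path, output_dir, dpi = 300)
  [
    'gs',
    '-dNOPAUSE', '-dBATCH', '-dSAFER', # run unattended and exit when done
    '-sDEVICE=tiffgray',               # grayscale TIFF output (device choice is an assumption)
    "-r#{dpi}",                        # resolution in dpi
    "-sOutputFile=#{File.join(output_dir, 'page-%04d.tif')}", # one numbered file per page
    pdf_path
  ]
end

command = ghostscript_command('wrapper.pdf', 'pages')
puts command.join(' ')

# To actually run it (requires ghostscript on the PATH):
#   system(*command) or raise 'ghostscript failed'
```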
<p>Once the document has been converted into images, the OCR software, <a href="http://code.google.com/p/tesseract-ocr/">tesseract-ocr</a>, will iterate through each image and produce a text file with the contents of the page. Now, with a directory full of text files, it&#8217;s time to store the contents of each page in the database. That&#8217;s where <a href="http://sphinxsearch.com/">sphinx</a> and <a href="http://freelancing-god.github.com/ts/en/">thinking sphinx</a> come into play. Sphinx is the full text search engine and thinking sphinx is a &#8216;concise and easy-to-use Ruby library that connects ActiveRecord to the Sphinx search daemon, managing configuration and searching.&#8217; I actually started the project with ferret/acts_as_ferret, but after reading so many good reviews of sphinx, and running into my own problems with ferret, I switched. The only downside is that setup is a little trickier and thinking sphinx doesn&#8217;t automatically update the index the way acts_as_ferret does, so there&#8217;s a cron job that handles that. The indexer is super fast though, so frequent indexing isn&#8217;t a problem.</p>
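<p>The OCR pass can be sketched in the same way. tesseract takes an image and an output base name and writes the text to BASE.txt, so each page image maps to one command and one text file. The helper names and the directory layout here are hypothetical.</p>

```ruby
# Illustrative sketch; the helper names and directory layout are hypothetical.
# tesseract is called as: tesseract IMAGE BASE, and it writes BASE.txt.
def tesseract_command(image_path)
  base = image_path.sub(/\.tif\z/, '')
  ['tesseract', image_path, base]
end

# Map each page image to the command that OCRs it and the text file it produces.
def ocr_plan(image_paths)
  image_paths.sort.map do |image|
    command = tesseract_command(image)
    { command: command, text_file: "#{command.last}.txt" }
  end
end

plan = ocr_plan(['pages/page-0002.tif', 'pages/page-0001.tif'])
plan.each { |step| puts "#{step[:command].join(' ')}  writes  #{step[:text_file]}" }

# Each command could then be run with system(*step[:command]), and the
# contents of step[:text_file] stored on the page's database record.
```

<p>Sorting the image paths keeps the pages in order, which matters once the text ends up in one row per page.</p>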
<p>The site also offers multiple file download, and I used the rubyzip library which makes it simple to zip up a bunch of docs into one.</p>
<p>As for design, we used a theme from themeforest.net. I was impressed by the quality of generic templates they have. They aren&#8217;t free, but are dirt cheap &#8211; $5 or $10  for most. I&#8217;ve used</p>
<p>It&#8217;s a Rails application, so as for gems/plugins, the usual suspects are there:  acts_as_commentable, exception_notification, restful_authentication, role_requirement, mislav-will_paginate, attachment_fu, and a few others: slicehost, thinking-sphinx, delayed_job, and rubyzip.</p>
<p>The real jewel in the list is &#8216;<a href="http://github.com/josh/slicehost">slicehost</a>&#8217;, which gives you a bunch of rake tasks for setting up your slice at <a href="http://www.slicehost.com/">slicehost</a>, my favorite hosting provider.</p>
<p>One other thing worth mentioning was an issue with the delayed_job process not stopping properly during deploys. The instance running during the deploy never stopped, so I kept ending up with multiple instances of delayed job running. The problem (with the solution below) was noted in the issues section on github, but I can&#8217;t find it now. Basically, the restart task looks like this:</p>
<pre>
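# namespace name assumed; the stop and start tasks are defined in it as well
namespace :delayed_job do
&nbsp;&nbsp;desc &quot;Restart the delayed_job process&quot;
&nbsp;&nbsp;task :restart, :roles =&gt; :app do
&nbsp;&nbsp;&nbsp;&nbsp;stop
&nbsp;&nbsp;&nbsp;&nbsp;wait_for_process_to_end(&#039;delayed_job&#039;)
&nbsp;&nbsp;&nbsp;&nbsp;start
&nbsp;&nbsp;end
end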
&nbsp;
def wait_for_process_to_end(process_name)
&nbsp;&nbsp;run &quot;COUNT=1; until [ $COUNT -eq 0 ]; do COUNT=`ps -ef | grep -v &#039;ps -ef&#039; | grep -v &#039;grep&#039; | grep -i &#039;#{process_name}&#039;|wc -l` ; echo &#039;waiting for #{process_name} to end&#039; ; sleep 2 ; done&quot;
end
</pre>
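<p>One caveat with the shell loop above: it will wait forever if the process never exits. Here is a small sketch of the same polling idea with a timeout, in plain Ruby. The wait_until helper and its parameters are hypothetical, and the block is a stand-in for whatever check tells you the process is gone.</p>

```ruby
# Illustrative sketch; wait_until and its parameters are hypothetical.
# Polls the block until it returns true or the timeout passes.
def wait_until(timeout: 30, interval: 2)
  deadline = Time.now + timeout
  until yield
    return false if Time.now >= deadline

    sleep interval
  end
  true
end

# Stand-in for "has delayed_job stopped?": reports true on the third check.
checks = 0
stopped = wait_until(timeout: 5, interval: 0.01) do
  checks += 1
  checks >= 3
end
puts stopped # true
```

<p>Returning false on a timeout lets the deploy decide whether to give up or kill the process, rather than hanging.</p>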
<p>As a final note, all the software used in this project is open source. I&#8217;m constantly reminded of that and impressed by it. My thanks to all who have contributed to the software used in this project!</p>
]]></content:encoded>
			<wfw:commentRss>http://brantinteractive.com/2009/12/06/project-review-ifwinsight-com/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
