<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Allen Day's Blog &#187; Mahout</title>
	<atom:link href="http://www.spicylogic.com/allenday/blog/category/computing/distributed-systems/hadoop/mahout/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spicylogic.com/allenday/blog</link>
	<description>♥data♥</description>
	<lastBuildDate>Mon, 21 Jun 2010 23:28:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Taste item-item recommender example</title>
		<link>http://www.spicylogic.com/allenday/blog/2009/02/11/taste-item-item-recommender-example/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2009/02/11/taste-item-item-recommender-example/#comments</comments>
		<pubDate>Wed, 11 Feb 2009 22:10:00 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
				<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Mahout]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/2009/02/11/taste-item-item-recommender-example/</guid>
		<description><![CDATA[I threw together a Mahout/Taste item-item recommender last night.

	public static void itemItemRecommendations&#40;String path, String file&#41; &#123;
		File f = new File&#40;path, file&#41;;
	    try &#123;
			DataModel model = new FileDataModel&#40;f&#41;;
			model.refresh&#40;null&#41;;
		    ItemSimilarity itemSimilarity = new LogLikelihoodSimilarity&#40;model&#41;;
		    ItemBasedRecommender itemRecommender = new GenericItemBasedRecommender&#40;model, itemSimilarity&#41;;
		    for &#40; Item [...]]]></description>
			<content:encoded><![CDATA[<p>I threw together a Mahout/Taste item-item recommender last night.</p>

<div class="wp_syntax"><div class="code"><pre class="java">	<span style="color: #000000; font-weight: bold;">public</span> <span style="color: #000000; font-weight: bold;">static</span> <span style="color: #993333;">void</span> itemItemRecommendations<span style="color: #66cc66;">&#40;</span><span style="color: #aaaadd; font-weight: bold;">String</span> path, <span style="color: #aaaadd; font-weight: bold;">String</span> file<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
		<span style="color: #aaaadd; font-weight: bold;">File</span> f = <span style="color: #000000; font-weight: bold;">new</span> <span style="color: #aaaadd; font-weight: bold;">File</span><span style="color: #66cc66;">&#40;</span>path, file<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
	    <span style="color: #000000; font-weight: bold;">try</span> <span style="color: #66cc66;">&#123;</span>
			DataModel model = <span style="color: #000000; font-weight: bold;">new</span> FileDataModel<span style="color: #66cc66;">&#40;</span>f<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
			model.<span style="color: #006600;">refresh</span><span style="color: #66cc66;">&#40;</span><span style="color: #000000; font-weight: bold;">null</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
		    ItemSimilarity itemSimilarity = <span style="color: #000000; font-weight: bold;">new</span> LogLikelihoodSimilarity<span style="color: #66cc66;">&#40;</span>model<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
		    ItemBasedRecommender itemRecommender = <span style="color: #000000; font-weight: bold;">new</span> GenericItemBasedRecommender<span style="color: #66cc66;">&#40;</span>model, itemSimilarity<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
		    <span style="color: #b1b100;">for</span> <span style="color: #66cc66;">&#40;</span> Item i : model.<span style="color: #006600;">getItems</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#41;</span>
			    <span style="color: #b1b100;">for</span> <span style="color: #66cc66;">&#40;</span> RecommendedItem j : itemRecommender.<span style="color: #006600;">mostSimilarItems</span><span style="color: #66cc66;">&#40;</span>i.<span style="color: #006600;">getID</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>, <span style="color: #cc66cc;">50</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#41;</span>
			    	<span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span> j.<span style="color: #006600;">getValue</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&gt;</span>= <span style="color: #cc66cc;">0.7</span> <span style="color: #66cc66;">&#41;</span>
			    		<span style="color: #aaaadd; font-weight: bold;">System</span>.<span style="color: #006600;">out</span>.<span style="color: #006600;">println</span><span style="color: #66cc66;">&#40;</span>i.<span style="color: #006600;">getID</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> + <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\t</span>&quot;</span> + j.<span style="color: #006600;">getItem</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>.<span style="color: #006600;">getID</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span> + <span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\t</span>&quot;</span> + <span style="color: #aaaadd; font-weight: bold;">String</span>.<span style="color: #006600;">format</span><span style="color: #66cc66;">&#40;</span><span style="color: #ff0000;">&quot;%.3f&quot;</span>, j.<span style="color: #006600;">getValue</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
		<span style="color: #66cc66;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #66cc66;">&#40;</span><span style="color: #aaaadd; font-weight: bold;">FileNotFoundException</span> e<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
			<span style="color: #808080; font-style: italic;">// TODO Auto-generated catch block</span>
			e.<span style="color: #006600;">printStackTrace</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
		<span style="color: #66cc66;">&#125;</span> <span style="color: #000000; font-weight: bold;">catch</span> <span style="color: #66cc66;">&#40;</span>TasteException e<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
			<span style="color: #808080; font-style: italic;">// TODO Auto-generated catch block</span>
			e.<span style="color: #006600;">printStackTrace</span><span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">;</span>
		<span style="color: #66cc66;">&#125;</span>
	<span style="color: #66cc66;">&#125;</span></pre></div></div>

<p>This outputs item1 &#8211;recommends&#8211;&gt; item2 pairs, one tab-separated line per pair, each with a similarity weight (only pairs scoring 0.7 or higher are printed). I&#8217;m putting these into a Solr document so I can display related item2s alongside item1 when it&#8217;s viewed.</p>
<p>Input data are comma-delimited &lt;userID,itemID,score&gt; tuples like so:</p>
<pre>
1fe7401b81eed49353d0cbeba5383848,5212,0.6
3c1832954a6e8781836fed670bb37b24,5212,1
70273e4c7c77700ee97acb8d0306c405,5213,0.8
1f057ccde135acbc881008bbf466e7e1,5213,1
51d44c7baca65ad39d11ba87bf2d438b,5213,1
adc924559b37114cd97d1f5cf7c71419,5213,1
78e254b4a11e61d76ff63cea02de4de8,5213,1
5c373ec7d9ad4a6f392c291d8ccba5ce,5213,0.2
fab8537564094fa8885f6214e6b682e1,5213,1
127f46aabcdbc2d2d04da8398a996c75,5213,1
</pre>
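<p>If you want to sanity-check a file like this before handing it to <code>FileDataModel</code>, here is a minimal sketch of reading one line in plain Java. The <code>PreferenceLine</code> class and its field names are hypothetical, made up for illustration; they are not part of Mahout/Taste, which does its own parsing.</p>

```java
// Hypothetical helper, not part of Mahout/Taste: holds one userID,itemID,score tuple.
final class PreferenceLine {
    final String userId;
    final String itemId;
    final double score;

    PreferenceLine(String userId, String itemId, double score) {
        this.userId = userId;
        this.itemId = itemId;
        this.score = score;
    }

    // Splits a comma-delimited line into exactly three fields and parses the score.
    static PreferenceLine parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected userID,itemID,score but got: " + line);
        }
        return new PreferenceLine(parts[0].trim(), parts[1].trim(), Double.parseDouble(parts[2].trim()));
    }

    public static void main(String[] args) {
        // Sample line taken from the input data shown above.
        PreferenceLine p = parse("1fe7401b81eed49353d0cbeba5383848,5212,0.6");
        System.out.println(p.userId + "\t" + p.itemId + "\t" + p.score);
    }
}
```

<p>In this sketch, <code>parse</code> throws <code>IllegalArgumentException</code> for a line that does not have exactly three fields or whose score is not a number, so a bad row is caught before the recommender runs.</p>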
<p>Works great.  Thanks Sean.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2009/02/11/taste-item-item-recommender-example/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Mahout ♥ HBase</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/09/21/mahout-%e2%99%a5-hbase/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/09/21/mahout-%e2%99%a5-hbase/#comments</comments>
		<pubDate>Mon, 22 Sep 2008 01:39:24 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
				<category><![CDATA[HBase]]></category>
		<category><![CDATA[Mahout]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=69</guid>
		<description><![CDATA[Well, more accurately I&#8217;m using and liking both Mahout and HBase in my work at BiggerBoat, but this is a more fun title and I like using the more obscure HTML character entities.
Anyway, I posted an adapter class to Apache JIRA on Friday for Mahout / HBase integration that allows HTables to effectively be manipulated [...]]]></description>
			<content:encoded><![CDATA[<p>Well, more accurately I&#8217;m using and liking both <a href="http://lucene.apache.org/mahout/">Mahout</a> and <a href="http://hadoop.apache.org/hbase/">HBase</a> in my work at <a href="http://biggerboat.com">BiggerBoat</a>, but this is a more fun title and I like using the more <a href="http://www.htmlcodetutorial.com/characterentities_famsupp_69.html">obscure HTML character entities</a>.</p>
<p>Anyway, I posted an adapter class to Apache JIRA on Friday for Mahout / HBase integration that allows <a href="http://hadoop.apache.org/hbase/docs/current/api/org/apache/hadoop/hbase/client/HTable.html">HTable</a>s to effectively be manipulated as sparse vectors (read through RowResult, write through BatchUpdate) in Mahout using the Vector interface.  Check it out, <a href="https://issues.apache.org/jira/browse/MAHOUT-78">MAHOUT-78</a>.</p>
<p>Looks like Mahout needs to generate some HTML for their javadoc.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/09/21/mahout-%e2%99%a5-hbase/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
