<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>

<channel>
	<title>Allen Day's Weblog</title>
	<atom:link href="http://www.spicylogic.com/allenday/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spicylogic.com/allenday/blog</link>
	<description>A Computational Scientist's View</description>
	<pubDate>Sun, 06 Jul 2008 01:44:01 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
	<language>en</language>
			<item>
		<title>pcoc - Piped Command Output Colorizer</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/07/05/pcoc-piped-command-output-colorizer/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/07/05/pcoc-piped-command-output-colorizer/#comments</comments>
		<pubDate>Sun, 06 Jul 2008 01:37:09 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Administration]]></category>

		<category><![CDATA[Analytics]]></category>

		<category><![CDATA[Perl]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/2008/07/05/pcoc-piped-command-output-colorizer/</guid>
		<description><![CDATA[I&#8217;m frequently monitoring web servers, cache servers, database servers, etc. by tailing their log files, e.g.

tail -f /etc/httpd/logs/access_log

I like the --color option provided by grep, but found it too limited (only one color allowed, no wildcard support).  After a bit of searching for a tool that does arbitrary colorizing, I found
acoc, [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m frequently monitoring web servers, cache servers, database servers, etc. by tailing their log files, e.g.</p>

<div class="wp_syntax"><div class="code"><pre class="bash"><span style="color: #c20cb9; font-weight: bold;">tail</span> -f <span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>httpd<span style="color: #000000; font-weight: bold;">/</span>logs<span style="color: #000000; font-weight: bold;">/</span>access_log</pre></div></div>

<p>I like the --color option provided by grep, but found it too limited (only one color allowed, no wildcard support).  After a bit of searching for a tool that does arbitrary colorizing, I found<br />
<a href="http://www.caliban.org/ruby/acoc.shtml">acoc, the Arbitrary Command Output Colourer</a>.</p>
<p>&#8230;which almost did what I needed, but couldn&#8217;t read from a pipe.  So I wrote pcoc, the Piped Command Output Colorizer.  I&#8217;m only publishing this because I&#8217;ve been using it for about 1 1/2 years, and still find it useful.</p>
<p>Source code at the end of this post.  Here&#8217;s an example that highlights iPhone/iPod user agents and requests with a 500/400/404 HTTP response:</p>

<div class="wp_syntax"><div class="code"><pre class="bash"><span style="color: #c20cb9; font-weight: bold;">tail</span> -f .<span style="color: #000000; font-weight: bold;">/</span>logs<span style="color: #000000; font-weight: bold;">/</span>access_log <span style="color: #000000; font-weight: bold;">|</span> pcoc -f <span style="color: #ff0000;">'(iPod)=bold cyan'</span> -f <span style="color: #ff0000;">'(iPhone)=bold magenta'</span> -f <span style="color: #ff0000;">'<span style="color: #000099; font-weight: bold;">\b</span>(500|404|400)<span style="color: #000099; font-weight: bold;">\b</span>=red on_black'</span></pre></div></div>

<p>Sorry, no screenshots :(.</p>
<p>pcoc source:</p>

<div class="wp_syntax"><div class="code"><pre class="perl"><span style="color: #808080; font-style: italic;">#!/usr/bin/perl</span>
<span style="color: #000000; font-weight: bold;">use</span> strict;
<span style="color: #000000; font-weight: bold;">use</span> Getopt::<span style="color: #006600;">Long</span>;
<span style="color: #000000; font-weight: bold;">use</span> Term::<span style="color: #006600;">ANSIColor</span> <span style="color: #000066;">qw</span><span style="color: #66cc66;">&#40;</span>colored<span style="color: #66cc66;">&#41;</span>;
$<span style="color: #66cc66;">|</span>++;
&nbsp;
<span style="color: #b1b100;">my</span> <span style="color: #0000ff;">%format</span> = <span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#41;</span>;
GetOptions<span style="color: #66cc66;">&#40;</span> <span style="color: #ff0000;">&quot;format|f=s&quot;</span> =<span style="color: #66cc66;">&gt;</span> \<span style="color: #0000ff;">%format</span><span style="color: #66cc66;">&#41;</span>;
&nbsp;
<span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #66cc66;">!</span> <span style="color: #000066;">keys</span> <span style="color: #0000ff;">%format</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
  <span style="color: #000066;">print</span> <span style="color: #66cc66;">&lt;&lt;</span><span style="color: #ff0000;">&quot;EOF&quot;</span>;
Synopsis:
        pcoc - Piped Command Output Colorizer.  Inspired by acoc.
&nbsp;
Usage:
&nbsp;
        $<span style="color: #cc66cc;">0</span> -f <span style="color: #ff0000;">'&lt;regex1&gt;=&lt;color1&gt;'</span> -f <span style="color: #ff0000;">'&lt;regex2&gt;=&lt;color2&gt;'</span>
&nbsp;
$<span style="color: #cc66cc;">0</span> reads from a <span style="color: #000066;">pipe</span> <span style="color: #b1b100;">and</span> colorizes <span style="color: #000066;">each</span> line based on <span style="color: #000066;">format</span> <span style="color: #66cc66;">&#40;</span>-f<span style="color: #66cc66;">&#41;</span> parameters.
&nbsp;
Arguments:
&nbsp;
-f <span style="color: #ff0000;">'&lt;regex&gt;=&lt;color&gt;'</span>  Required, multiple <span style="color: #000066;">values</span> okay. 
&nbsp;
        <span style="color: #009999;">&lt;regex&gt;</span>: A regular expression from which \$<span style="color: #cc66cc;">1</span> will be colorized
&nbsp;
        <span style="color: #009999;">&lt;color&gt;</span>: One <span style="color: #b1b100;">or</span> more colorization keywords, see perldoc
        Term::<span style="color: #006600;">ANSIColor</span>, but briefly they are:
&nbsp;
        boldness:
                bold
        foreground:
                red yellow green blue magenta cyan black white
        background:
                on_red on_yellow on_green on_blue on_magenta on_cyan
                on_black on_white
&nbsp;
Examples:
&nbsp;
        <span style="color: #808080; font-style: italic;">#highlight the account's shell in bold green</span>
        cat <span style="color: #66cc66;">/</span>etc<span style="color: #66cc66;">/</span>passwd <span style="color: #66cc66;">|</span> $<span style="color: #cc66cc;">0</span> -f <span style="color: #ff0000;">'.+:([^:]+)\$=bold green'</span>
&nbsp;
        <span style="color: #808080; font-style: italic;">#... and the username in red with black background</span>
        cat <span style="color: #66cc66;">/</span>etc<span style="color: #66cc66;">/</span>passwd <span style="color: #66cc66;">|</span> $<span style="color: #cc66cc;">0</span> -f <span style="color: #ff0000;">'([^:]+)=red on_black'</span> -f <span style="color: #ff0000;">'.+:([^:]+)\$=bold green'</span>
&nbsp;
Copyright<span style="color: #66cc66;">/</span>License:
&nbsp;
        Allen Day <span style="color: #66cc66;">&lt;</span>allenday\<span style="color: #0000ff;">@ucla</span>.edu<span style="color: #66cc66;">&gt;</span>, licensed under GPL <span style="color: #cc66cc;">2006</span><span style="color: #cc66cc;">-2008</span>
&nbsp;
EOF
  <span style="color: #000066;">exit</span><span style="color: #66cc66;">&#40;</span><span style="color: #cc66cc;">1</span><span style="color: #66cc66;">&#41;</span>;
<span style="color: #66cc66;">&#125;</span>
&nbsp;
<span style="color: #b1b100;">while</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #b1b100;">my</span> <span style="color: #0000ff;">$line</span> = <span style="color: #66cc66;">&lt;&gt;</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
  <span style="color: #000066;">chomp</span><span style="color: #66cc66;">&#40;</span> <span style="color: #0000ff;">$line</span> <span style="color: #66cc66;">&#41;</span>;
  <span style="color: #b1b100;">foreach</span> <span style="color: #b1b100;">my</span> <span style="color: #0000ff;">$f</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #000066;">keys</span> <span style="color: #0000ff;">%format</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
    <span style="color: #b1b100;">my</span> <span style="color: #0000ff;">@c</span> = <span style="color: #000066;">split</span> <span style="color: #ff0000;">','</span>, <span style="color: #0000ff;">$format</span><span style="color: #66cc66;">&#123;</span> <span style="color: #0000ff;">$f</span> <span style="color: #66cc66;">&#125;</span>;
&nbsp;
    <span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #0000ff;">$line</span> =~ <span style="color: #000066;">qr</span><span style="color: #66cc66;">/</span><span style="color: #0000ff;">$f</span><span style="color: #66cc66;">/</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
      <span style="color: #b1b100;">while</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #b1b100;">my</span> <span style="color: #66cc66;">&#40;</span> <span style="color: #0000ff;">$s</span>, <span style="color: #0000ff;">$t</span> <span style="color: #66cc66;">&#41;</span> = <span style="color: #0000ff;">$f</span> =~ <span style="color: #000066;">m</span><span style="color: #66cc66;">/</span>^<span style="color: #66cc66;">&#40;</span>.<span style="color: #66cc66;">*</span>?<span style="color: #66cc66;">&#41;</span>\<span style="color: #66cc66;">&#40;</span>+<span style="color: #66cc66;">&#40;</span>.+?<span style="color: #66cc66;">&#41;</span>\<span style="color: #66cc66;">&#41;</span>+<span style="color: #66cc66;">/</span> <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
        <span style="color: #b1b100;">my</span> <span style="color: #0000ff;">$c</span> = <span style="color: #000066;">pop</span> <span style="color: #0000ff;">@c</span> <span style="color: #66cc66;">||</span> <span style="color: #b1b100;">last</span>;
        <span style="color: #0000ff;">$line</span> =~ <span style="color: #000066;">s</span><span style="color: #66cc66;">/</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$s</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">&#40;</span><span style="color: #0000ff;">$t</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">/</span>$<span style="color: #cc66cc;">1</span>.colored<span style="color: #66cc66;">&#40;</span>$<span style="color: #cc66cc;">2</span>,<span style="color: #0000ff;">$c</span><span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">/</span>e;
        <span style="color: #0000ff;">$f</span> =~ <span style="color: #000066;">s</span><span style="color: #66cc66;">/</span>^<span style="color: #66cc66;">&#40;</span>.<span style="color: #66cc66;">*</span>?<span style="color: #66cc66;">&#41;</span>\<span style="color: #66cc66;">&#40;</span><span style="color: #66cc66;">&#40;</span>.+?<span style="color: #66cc66;">&#41;</span>\<span style="color: #66cc66;">&#41;</span><span style="color: #66cc66;">/</span>$<span style="color: #cc66cc;">1</span>$<span style="color: #cc66cc;">2</span><span style="color: #66cc66;">/</span>;
      <span style="color: #66cc66;">&#125;</span>
    <span style="color: #66cc66;">&#125;</span>
  <span style="color: #66cc66;">&#125;</span>
  <span style="color: #000066;">print</span> <span style="color: #ff0000;">&quot;$line<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>;
<span style="color: #66cc66;">&#125;</span></pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/07/05/pcoc-piped-command-output-colorizer/feed/</wfw:commentRss>
		</item>
		<item>
		<title>iPhone 2.0 User-Agent string, other iPhone/iPod data</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/07/04/iphone-20-user-agent-string-other-iphoneipod-data/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/07/04/iphone-20-user-agent-string-other-iphoneipod-data/#comments</comments>
		<pubDate>Sat, 05 Jul 2008 01:53:10 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Informatics]]></category>

		<category><![CDATA[Mobile]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=39</guid>
		<description><![CDATA[I was preparing a report on iPhone locales from some web server logs, and noticed a few oddities.  Some of the hits appear to be coming from the new 3G iPhone 2.0; check out the User-Agent strings:

# observed from 1 metrocast.net (NY) IP
Mozilla/5.0 &#40;iPod; U; iPhone OS 2_0 like Mac OS X; en-us&#41; AppleWebKit/525.17 [...]]]></description>
			<content:encoded><![CDATA[<p>I was preparing a report on iPhone locales from some web server logs, and noticed a few oddities.  Some of the hits appear to be coming from the new 3G iPhone 2.0; check out the User-Agent strings:</p>

<div class="wp_syntax"><div class="code"><pre class="bash"><span style="color: #808080; font-style: italic;"># observed from 1 metrocast.net (NY) IP</span>
Mozilla<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5.0</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>iPod; U; iPhone OS 2_0 like Mac OS X; en-us<span style="color: #7a0874; font-weight: bold;">&#41;</span> AppleWebKit<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">525.17</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>KHTML, like Gecko<span style="color: #7a0874; font-weight: bold;">&#41;</span> Version<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">3.1</span> Mobile<span style="color: #000000; font-weight: bold;">/</span>5A240d Safari<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5525.7</span>
<span style="color: #808080; font-style: italic;"># observed from 1 optonline.net (NY) IP</span>
Mozilla<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5.0</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>iPhone Simulator; U; CPU iPhone OS 2_0 like Mac OS X; en-us<span style="color: #7a0874; font-weight: bold;">&#41;</span> AppleWebKit<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">525.18</span><span style="color: #000000;">.1</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>KHTML, like Gecko<span style="color: #7a0874; font-weight: bold;">&#41;</span> Version<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">3.1</span><span style="color: #000000;">.1</span> Mobile<span style="color: #000000; font-weight: bold;">/</span>5A345 Safari<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">525.20</span></pre></div></div>

<p>The former is confirmed to be an <a href="http://forums.macrumors.com/showthread.php?t=471274">iPhone 2.0 User-Agent string</a> on the MacRumors Forums.</p>
<p>Other unusual/rare iPhone/iPod User-Agent (UA) strings:</p>

<div class="wp_syntax"><div class="code"><pre class="bash">Mozilla<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5.0</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>iPhone; U; CPU like Mac OS X; en<span style="color: #7a0874; font-weight: bold;">&#41;</span> AppleWebKit<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">420.1</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>KHTML, like Gecko<span style="color: #7a0874; font-weight: bold;">&#41;</span> Version<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">3.0</span> Mobile<span style="color: #000000; font-weight: bold;">/</span>4A102 Safari<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">419</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>United States<span style="color: #7a0874; font-weight: bold;">&#41;</span>
Mozilla<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5.0</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>Windows; U; Windows NT <span style="color: #000000;">5.1</span>; en-US; rv:<span style="color: #000000;">1.9</span><span style="color: #7a0874; font-weight: bold;">&#41;</span> Gecko<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">2008052906</span> Mozilla<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5.0</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>iPhone; U; CPU like Mac OS X; en<span style="color: #7a0874; font-weight: bold;">&#41;</span> AppleWebKit<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">420</span>+ <span style="color: #7a0874; font-weight: bold;">&#40;</span>KHTML, like Gecko<span style="color: #7a0874; font-weight: bold;">&#41;</span> Version<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">3.0</span> Mobile<span style="color: #000000; font-weight: bold;">/</span>1A543 Safari<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">419.3</span>
Mozilla<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">5.0</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>iPhone; U; CPU like Mac OS X; en<span style="color: #7a0874; font-weight: bold;">&#41;</span> AppleWebKit<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">420.1</span> <span style="color: #7a0874; font-weight: bold;">&#40;</span>KHTML, like Gecko<span style="color: #7a0874; font-weight: bold;">&#41;</span> Cydia<span style="color: #000000; font-weight: bold;">/</span><span style="color: #000000;">1.0</span><span style="color: #000000;">.2460</span><span style="color: #000000;">-59</span></pre></div></div>
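<p>For a locale report like the one mentioned at the top, the device and locale can be pulled out of the first parenthesized group of each string.  The sketch below is only an illustration: the function name ua_device_locale is made up, and it assumes the device is the first "; "-separated field and the locale is the last one.  That holds for the plain iPhone/iPod strings above but not for the one prefixed with a Windows/Gecko string, where the last field is rv:1.9.</p>

```shell
# Hypothetical helper (an assumption, not an established tool): print the
# first and last "; "-separated fields of the first (...) group in a UA string.
ua_device_locale() {
  printf '%s\n' "$1" | awk '{
    start = index($0, "(")                 # position of the first "("
    rest  = substr($0, start + 1)          # text after it
    stop  = index(rest, ")")               # first ")" in the remainder
    n = split(substr(rest, 1, stop - 1), f, "; ")
    print f[1], f[n]
  }'
}

ua_device_locale 'Mozilla/5.0 (iPod; U; iPhone OS 2_0 like Mac OS X; en-us) AppleWebKit/525.17 (KHTML, like Gecko) Version/3.1 Mobile/5A240d Safari/5525.7'
```

<p>For the string above this prints "iPod en-us".</p>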

<p>Know anything about these?  Leave me a comment!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/07/04/iphone-20-user-agent-string-other-iphoneipod-data/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Notes on setting up Taste</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/06/30/notes-on-setting-up-taste/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/06/30/notes-on-setting-up-taste/#comments</comments>
		<pubDate>Mon, 30 Jun 2008 10:49:55 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Informatics]]></category>

		<category><![CDATA[Java]]></category>

		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/2008/06/30/notes-on-setting-up-taste/</guid>
		<description><![CDATA[

Setting up Taste v1.7.2 on a CentOS 4 x86_64 box.
Taste has merged with Mahout now, but I still want to do this standalone because I&#8217;m having trouble getting the JUnit tests to pass for Mahout.  With that out of the way&#8230;
These are the shell commands I assembled after following the Taste Demo guide.

#make sure [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/06/taste.png"><img class="alignright size-full wp-image-36" title="hadoop" src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/06/taste.png" alt="" /></a><br />
<a href='http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/06/mahout-logo-82x100.png'><img src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/06/mahout-logo-82x100.png" alt="" title="mahout-logo-82x100" width="82" height="100" class="alignright size-full wp-image-38" /></a></p>
<p>Setting up Taste v1.7.2 on a CentOS 4 x86_64 box.</p>
<p>Taste has merged with <a href="http://lucene.apache.org/mahout/">Mahout</a> now, but I still want to do this standalone because I&#8217;m having trouble getting the JUnit tests to pass for Mahout.  With that out of the way&#8230;</p>
<p>These are the shell commands I assembled after following the <a href="http://taste.sourceforge.net/#demo">Taste Demo guide</a>.</p>

<div class="wp_syntax"><div class="code"><pre class="bash"><span style="color: #808080; font-style: italic;">#make sure you have ant and the JDK.  I don't recommend the CentOS stock packages; get them from Sun/Apache</span>
<span style="color: #808080; font-style: italic;">#download necessary .jar files, sources, data files.  unpack/move them to correct locations.</span>
<span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>internap.dl.sourceforge.net<span style="color: #000000; font-weight: bold;">/</span>sourceforge<span style="color: #000000; font-weight: bold;">/</span>taste<span style="color: #000000; font-weight: bold;">/</span>taste<span style="color: #000000;">-1.7</span><span style="color: #000000;">.2</span>.<span style="color: #c20cb9; font-weight: bold;">zip</span>
<span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>internap.dl.sourceforge.net<span style="color: #000000; font-weight: bold;">/</span>sourceforge<span style="color: #000000; font-weight: bold;">/</span>proguard<span style="color: #000000; font-weight: bold;">/</span>proguard4<span style="color: #000000;">.2</span>.<span style="color: #c20cb9; font-weight: bold;">zip</span>
<span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>www.grouplens.org<span style="color: #000000; font-weight: bold;">/</span>system<span style="color: #000000; font-weight: bold;">/</span>files<span style="color: #000000; font-weight: bold;">/</span>million-ml-data.tar__0.gz
<span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>www.hightechimpact.com<span style="color: #000000; font-weight: bold;">/</span>Apache<span style="color: #000000; font-weight: bold;">/</span>tomcat<span style="color: #000000; font-weight: bold;">/</span>tomcat<span style="color: #000000;">-5</span><span style="color: #000000; font-weight: bold;">/</span>v5<span style="color: #000000;">.5</span><span style="color: #000000;">.26</span><span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span>.<span style="color: #c20cb9; font-weight: bold;">tar</span>.gz
<span style="color: #c20cb9; font-weight: bold;">unzip</span> taste<span style="color: #000000;">-1.7</span><span style="color: #000000;">.2</span>.<span style="color: #c20cb9; font-weight: bold;">zip</span>
<span style="color: #c20cb9; font-weight: bold;">unzip</span> proguard4<span style="color: #000000;">.2</span>.<span style="color: #c20cb9; font-weight: bold;">zip</span>
<span style="color: #c20cb9; font-weight: bold;">tar</span> -xvzf million-ml-data.tar__0.gz
<span style="color: #c20cb9; font-weight: bold;">tar</span> -xvzf apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span>.<span style="color: #c20cb9; font-weight: bold;">tar</span>.gz
<span style="color: #c20cb9; font-weight: bold;">cp</span> proguard4<span style="color: #000000;">.2</span><span style="color: #000000; font-weight: bold;">/</span>lib<span style="color: #000000; font-weight: bold;">/</span>proguard.jar lib<span style="color: #000000; font-weight: bold;">/</span>
<span style="color: #c20cb9; font-weight: bold;">mv</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>mr<span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #000000; font-weight: bold;">*</span>.dat src<span style="color: #000000; font-weight: bold;">/</span>example<span style="color: #000000; font-weight: bold;">/</span>com<span style="color: #000000; font-weight: bold;">/</span>planetj<span style="color: #000000; font-weight: bold;">/</span>taste<span style="color: #000000; font-weight: bold;">/</span>example<span style="color: #000000; font-weight: bold;">/</span>grouplens<span style="color: #000000; font-weight: bold;">/</span>
<span style="color: #808080; font-style: italic;">#start up tomcat on port 8080 (default)</span>
<span style="color: #007800;">JAVA_OPTS=</span><span style="color: #ff0000;">&quot;-server -da -dsa -Xms1024m -Xmx1024m&quot;</span> <span style="color: #007800;">JAVA_HOME=</span><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>java<span style="color: #000000; font-weight: bold;">/</span>jdk1<span style="color: #000000;">.6</span>.0_02 <span style="color: #c20cb9; font-weight: bold;">sh</span> apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span><span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>startup.<span style="color: #c20cb9; font-weight: bold;">sh</span>
<span style="color: #808080; font-style: italic;">#build taste.war, and inject it into tomcat</span>
<span style="color: #007800;">JDK_HOME=</span><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>java<span style="color: #000000; font-weight: bold;">/</span>jdk1<span style="color: #000000;">.6</span>.0_02 <span style="color: #007800;">JAVA_HOME=</span><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>java<span style="color: #000000; font-weight: bold;">/</span>jdk1<span style="color: #000000;">.6</span>.0_02 ant build-grouplens-example
<span style="color: #c20cb9; font-weight: bold;">cp</span> taste.war apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span><span style="color: #000000; font-weight: bold;">/</span>webapps<span style="color: #000000; font-weight: bold;">/</span>
<span style="color: #808080; font-style: italic;">#test the app.  may take a minute or two on the first query.</span>
<span style="color: #c20cb9; font-weight: bold;">wget</span> -O - -S <span style="color: #ff0000;">'http://localhost:8080/taste/RecommenderServlet?userID=1&amp;debug=true'</span></pre></div></div>

<p>Once you get that working, you can tweak the demo slightly to work on another data set.  You just need to know the grouplens file format.  ratings.dat is of the format:</p>
<pre>UserID::MovieID::Rating::Timestamp</pre>
<p>e.g.</p>
<pre>1::1193::5::978300760</pre>
<p>and movies.dat is of the format:</p>
<pre>MovieID::Title::Genres</pre>
<p>e.g.</p>
<pre>1::Toy Story (1995)::Animation|Children's|Comedy</pre>
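<p>If you want to sanity-check files in this format before rebuilding, a quick sketch like the following splits each line on the "::" delimiter.  The helper names parse_rating and parse_movie are made up for illustration and are not part of Taste; the only assumption is an awk that accepts a multi-character field separator, which the common implementations do.</p>

```shell
# Hypothetical helpers (not from Taste): split grouplens-style lines on "::".
parse_rating() {
  printf '%s\n' "$1" | awk -F'::' '{ print "user=" $1, "movie=" $2, "rating=" $3, "time=" $4 }'
}

parse_movie() {
  # Genres are "|"-separated inside the third field; count them.
  printf '%s\n' "$1" | awk -F'::' '{ n = split($3, g, "|"); print $1 ": " $2 " (" n " genres)" }'
}

parse_rating '1::1193::5::978300760'
parse_movie "1::Toy Story (1995)::Animation|Children's|Comedy"
```

<p>For the two example lines above this prints "user=1 movie=1193 rating=5 time=978300760" and "1: Toy Story (1995) (3 genres)".</p>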
<p>I wrote a script, let&#8217;s call it load_taste.pl, that can generate new movies.dat and ratings.dat files from an alternate data source.  With these new files, I can drop them in place of the grouplens data, rebuild the .war file, and make recommendations on this other data set.  Here&#8217;s how to do it:</p>

<div class="wp_syntax"><div class="code"><pre class="bash"><span style="color: #808080; font-style: italic;">#generate ratings.dat and movies.dat.  move them to replace the grouplens data files.</span>
<span style="color: #c20cb9; font-weight: bold;">perl</span> .<span style="color: #000000; font-weight: bold;">/</span>load_taste.pl
<span style="color: #c20cb9; font-weight: bold;">mv</span> <span style="color: #7a0874; font-weight: bold;">&#91;</span>mr<span style="color: #7a0874; font-weight: bold;">&#93;</span><span style="color: #000000; font-weight: bold;">*</span>.dat src<span style="color: #000000; font-weight: bold;">/</span>example<span style="color: #000000; font-weight: bold;">/</span>com<span style="color: #000000; font-weight: bold;">/</span>planetj<span style="color: #000000; font-weight: bold;">/</span>taste<span style="color: #000000; font-weight: bold;">/</span>example<span style="color: #000000; font-weight: bold;">/</span>grouplens<span style="color: #000000; font-weight: bold;">/</span>
<span style="color: #808080; font-style: italic;">#get rid of stale .war and .jar files</span>
<span style="color: #c20cb9; font-weight: bold;">rm</span> taste.war grouplens.jar
<span style="color: #808080; font-style: italic;">#build the &quot;quick&quot; version of the example.  see below for build.xml patch</span>
<span style="color: #007800;">JDK_HOME=</span><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>java<span style="color: #000000; font-weight: bold;">/</span>jdk1<span style="color: #000000;">.6</span>.0_02 <span style="color: #007800;">JAVA_HOME=</span><span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>java<span style="color: #000000; font-weight: bold;">/</span>jdk1<span style="color: #000000;">.6</span>.0_02 ant build-grouplens-example-quick
<span style="color: #808080; font-style: italic;">#inject the re-built .war file into tomcat.</span>
<span style="color: #c20cb9; font-weight: bold;">cp</span> taste.war apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span><span style="color: #000000; font-weight: bold;">/</span>webapps<span style="color: #000000; font-weight: bold;">/</span>
<span style="color: #808080; font-style: italic;">#get rid of stale tomcat caches</span>
<span style="color: #c20cb9; font-weight: bold;">rm</span> -rf apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span><span style="color: #000000; font-weight: bold;">/</span>webapps<span style="color: #000000; font-weight: bold;">/</span>taste apache-tomcat<span style="color: #000000;">-5.5</span><span style="color: #000000;">.26</span><span style="color: #000000; font-weight: bold;">/</span>temp<span style="color: #000000; font-weight: bold;">/</span>taste.<span style="color: #000000; font-weight: bold;">*</span>.txt</pre></div></div>

<p>Note that I&#8217;ve defined a new ant build target called &#8220;build-grouplens-example-quick&#8221;.  It rebuilds only grouplens.jar and taste.war, rather than reoptimizing/reverifying/rebuilding taste.jar, etc.  The &#8220;build-grouplens-example&#8221; target takes ~55 seconds to complete on my machine, whereas the &#8220;build-grouplens-example-quick&#8221; target takes ~2 seconds.  Here&#8217;s a diff against the original build.xml file:</p>

<div class="wp_syntax"><div class="code"><pre class="diff"><span style="color: #888822;">--- /tmp/build.xml      <span style="">2008</span><span style="">-03</span><span style="">-21</span> <span style="">21</span>:<span style="">18</span>:<span style="">20.000000000</span> <span style="">-0700</span></span>
<span style="color: #888822;">+++ ./build.xml <span style="">2008</span><span style="">-06</span><span style="">-30</span> <span style="">11</span>:<span style="">46</span>:<span style="">18.000000000</span> <span style="">-0700</span></span>
<span style="color: #440088;">@@ <span style="">-161</span>,<span style="">6</span> <span style="">+161</span>,<span style="">58</span> @@</span>
      &lt;delete file=&quot;$<span style="">&#123;</span>my-web.xml<span style="">&#125;</span>&quot;/&gt;
   &lt;/target&gt;
&nbsp;
<span style="color: #00b000;">+  &lt;target depends=&quot;&quot; name=&quot;build-taste-server-quick&quot; description=&quot;Builds deployable web-based Taste server&quot;&gt;</span>
<span style="color: #00b000;">+     &lt;fail unless=&quot;my-recommender.jar&quot; message=&quot;Please set -Dmy-recommender.jar=XXX&quot;/&gt;</span>
<span style="color: #00b000;">+     &lt;fail unless=&quot;my-recommender-class&quot; message=&quot;Please set -Dmy-recommender-class=XXX&quot;/&gt;</span>
<span style="color: #00b000;">+     &lt;tempfile property=&quot;my-web.xml&quot;/&gt;</span>
<span style="color: #00b000;">+     &lt;copy file=&quot;src/main/com/planetj/taste/web/web.xml&quot; tofile=&quot;$<span style="">&#123;</span>my-web.xml<span style="">&#125;</span>&quot;&gt;</span>
<span style="color: #00b000;">+       &lt;filterset&gt;</span>
<span style="color: #00b000;">+               &lt;filter token=&quot;RECOMMENDER_CLASS&quot; value=&quot;$<span style="">&#123;</span>my-recommender-class<span style="">&#125;</span>&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/filterset&gt;</span>
<span style="color: #00b000;">+     &lt;/copy&gt;</span>
<span style="color: #00b000;">+     &lt;war destfile=&quot;$<span style="">&#123;</span>release-war<span style="">&#125;</span>&quot; webxml=&quot;$<span style="">&#123;</span>my-web.xml<span style="">&#125;</span>&quot;&gt;</span>
<span style="color: #00b000;">+       &lt;lib dir=&quot;.&quot;&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;$<span style="">&#123;</span>release-jar<span style="">&#125;</span>&quot;/&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;$<span style="">&#123;</span>my-recommender.jar<span style="">&#125;</span>&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/lib&gt;</span>
<span style="color: #00b000;">+       &lt;lib dir=&quot;lib/axis&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;classes dir=&quot;build&quot;&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;com/planetj/taste/web/**&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/classes&gt;</span>
<span style="color: #00b000;">+       &lt;fileset dir=&quot;src/main/com/planetj/taste/web&quot;&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;RecommenderService.jws&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/fileset&gt;</span>
<span style="color: #00b000;">+     &lt;/war&gt;</span>
<span style="color: #00b000;">+     &lt;delete file=&quot;$<span style="">&#123;</span>my-web.xml<span style="">&#125;</span>&quot;/&gt;</span>
<span style="color: #00b000;">+  &lt;/target&gt;</span>
<span style="color: #00b000;">+  &lt;target depends=&quot;&quot; name=&quot;build-grouplens-example-quick&quot; description=&quot;Builds deployable GroupLens example&quot;&gt;</span>
<span style="color: #00b000;">+     &lt;javac source=&quot;<span style="">1.5</span>&quot;</span>
<span style="color: #00b000;">+            target=&quot;<span style="">1.5</span>&quot;</span>
<span style="color: #00b000;">+            deprecation=&quot;true&quot;</span>
<span style="color: #00b000;">+          debug=&quot;true&quot;</span>
<span style="color: #00b000;">+          optimize=&quot;false&quot;</span>
<span style="color: #00b000;">+            destdir=&quot;build&quot;</span>
<span style="color: #00b000;">+            srcdir=&quot;src/example&quot;&gt;</span>
<span style="color: #00b000;">+       &lt;compilerarg value=&quot;-Xlint:all&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;classpath&gt;</span>
<span style="color: #00b000;">+               &lt;pathelement location=&quot;$<span style="">&#123;</span>release-jar<span style="">&#125;</span>&quot;/&gt;</span>
<span style="color: #00b000;">+               &lt;pathelement location=&quot;$<span style="">&#123;</span>annotations.jar<span style="">&#125;</span>&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/classpath&gt;</span>
<span style="color: #00b000;">+     &lt;/javac&gt;</span>
<span style="color: #00b000;">+     &lt;jar jarfile=&quot;grouplens.jar&quot;&gt;</span>
<span style="color: #00b000;">+       &lt;fileset dir=&quot;src/example&quot;&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;com/planetj/taste/example/grouplens/ratings.dat&quot;/&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;com/planetj/taste/example/grouplens/movies.dat&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/fileset&gt;</span>
<span style="color: #00b000;">+       &lt;fileset dir=&quot;build&quot;&gt;</span>
<span style="color: #00b000;">+               &lt;include name=&quot;com/planetj/taste/example/grouplens/**&quot;/&gt;</span>
<span style="color: #00b000;">+       &lt;/fileset&gt;</span>
<span style="color: #00b000;">+     &lt;/jar&gt;</span>
<span style="color: #00b000;">+     &lt;property name=&quot;my-recommender.jar&quot; value=&quot;grouplens.jar&quot;/&gt;</span>
<span style="color: #00b000;">+     &lt;property name=&quot;my-recommender-class&quot; value=&quot;com.planetj.taste.example.grouplens.GroupLensRecommender&quot;/&gt;</span>
<span style="color: #00b000;">+     &lt;antcall target=&quot;build-taste-server-quick&quot;/&gt;</span>
<span style="color: #00b000;">+  &lt;/target&gt;</span>
<span style="color: #00b000;">+</span>
   &lt;target depends=&quot;build,optimize&quot; name=&quot;build-grouplens-example&quot; description=&quot;Builds deployable GroupLens example&quot;&gt;
      &lt;javac source=&quot;<span style="">1.5</span>&quot;
             target=&quot;<span style="">1.5</span>&quot;</pre></div></div>

]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/06/30/notes-on-setting-up-taste/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Notes on setting up hadoop on CentOS 5 x86</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/06/20/notes-on-setting-up-hadoop-on-centos-5-x86/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/06/20/notes-on-setting-up-hadoop-on-centos-5-x86/#comments</comments>
		<pubDate>Sat, 21 Jun 2008 06:53:54 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Computing]]></category>

		<category><![CDATA[Java]]></category>

		<category><![CDATA[Science]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=34</guid>
		<description><![CDATA[
Overview
This documents my experience setting up hadoop on three 2-cpu hyperthreaded Centos 5 x86 boxes.
The machines being used are called:

hadoop-0-0
hadoop-0-1
hadoop-0-2

System Setup
Unless otherwise stated, each task was executed on all three machines.
Set up hadoop user
I added a &#8220;hadoop&#8221; user up on each node, and untarred the hadoop source directly in to the /home/hadoop directory.  So there [...]]]></description>
			<content:encoded><![CDATA[<p><a href='http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/06/hadoop.gif'><img src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/06/hadoop.gif" alt="" title="hadoop" width="171" height="100" class="alignright size-full wp-image-35" /></a></p>
<h2>Overview</h2>
<p>This documents my experience setting up <a href="http://hadoop.apache.org/core/">hadoop</a> on three 2-cpu hyperthreaded Centos 5 x86 boxes.</p>
<p>The machines being used are called:</p>
<ul>
<li>hadoop-0-0</li>
<li>hadoop-0-1</li>
<li>hadoop-0-2</li>
</ul>
<h2>System Setup</h2>
<p>Unless otherwise stated, each task was executed on all three machines.</p>
<h3>Set up hadoop user</h3>
<p>I added a &#8220;hadoop&#8221; user on each node, and untarred the hadoop tarball directly into the /home/hadoop directory, so that there was a /home/hadoop/conf, /home/hadoop/bin, etc.</p>
<pre>useradd hadoop
passwd hadoop
cd /tmp
wget http://apache.siamwebhosting.com/hadoop/core/hadoop-0.17.0/hadoop-0.17.0.tar.gz
su hadoop
cd ~
tar -xvzf /tmp/hadoop*gz
mv hadoop*/* ./
rmdir hadoop*
exit</pre>
<p>Next, I set up passphrase-less ssh.  On hadoop-0-0, I set up a DSA key:</p>
<pre>su hadoop
ssh-keygen -t dsa
#[...use blank passphrase...]
scp ~/.ssh/id_dsa.pub hadoop-0-1:/tmp/
scp ~/.ssh/id_dsa.pub hadoop-0-2:/tmp/</pre>
<p>Then, on hadoop-0-1 and hadoop-0-2:</p>
<pre>su hadoop
mkdir ~/.ssh
chmod 700 ~/.ssh
cp /tmp/id_dsa.pub ~/.ssh/authorized_keys
chmod 644 ~/.ssh/authorized_keys</pre>
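<p>This allows the hadoop user to log in from hadoop-0-0 to both hadoop-0-1 and hadoop-0-2 without typing a password.  It&#8217;s necessary because the hadoop start scripts use ssh to launch the daemons on the slave nodes.</p>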
<h3>Install prerequisite software</h3>
<p>ssh and rsync are required.  You can install them like so:</p>
<pre>yum -y install openssh-server openssh-clients rsync</pre>
<p>I installed <a href="http://java.sun.com/javase/downloads/index.jsp">Java 6 from Sun</a> (it&#8217;s important to mention here that the CentOS yum/JPackage RPMs for gcc-java, etc did not work for Mahout, so it had to be the Sun Java).</p>
<pre>rpm -Uvh j*rpm</pre>
<h3>Set up system services</h3>
<p>I turned off iptables and ip6tables, and disabled them on startup.  You could configure them, but I just turned them off.  They get in the way of the nodes communicating.</p>
<pre>/etc/init.d/iptables stop
/etc/init.d/ip6tables stop
/usr/sbin/ntsysv
#[...]</pre>
<p>Then, I edited the hosts file.  I actually did this later in the process after a bunch of debugging, but it makes sense to do it here.  You need to make sure that the name by which you refer to a host is not associated with the loopback IP address (127.0.0.1).  So your /etc/hosts file should look something like this on hadoop-0-0:</p>
<pre>127.0.0.1    localhost localhost.localdomain
::1    localhost6.localdomain6 localhost6
10.0.0.1    hadoop-0-0</pre>
<p>where the 10.0.0.1 entry is optional if you have some other way to resolve it, e.g. DNS.  The key point is that the hadoop-0-0 name not be associated with 127.0.0.1.</p>
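<p>A quick way to check the loopback rule above is to scan the hosts file for any non-localhost name that sits on a loopback line.  This is only a sketch: the function name is mine, and the sample text is a deliberately broken variant of the file shown above.</p>

```python
import ipaddress

def names_on_loopback(hosts_text):
    """Return host names that a hosts file maps to a loopback address."""
    flagged = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        try:
            is_loopback = ipaddress.ip_address(address).is_loopback
        except ValueError:
            continue  # not an IP address in the first column, skip the line
        if is_loopback:
            # localhost aliases are expected on loopback; anything else is suspect
            flagged.extend(n for n in names if not n.startswith("localhost"))
    return flagged

# Deliberately broken sample: hadoop-0-0 also appears on the 127.0.0.1 line.
sample = """127.0.0.1    localhost localhost.localdomain hadoop-0-0
::1    localhost6.localdomain6 localhost6
10.0.0.1    hadoop-0-0
"""
print(names_on_loopback(sample))
```

<p>An empty list means no cluster host name is bound to the loopback address.  To check a real machine, pass in the contents of /etc/hosts instead of the sample text.</p>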
<h3>Configuring hadoop in standalone mode</h3>
<p>I tried using the <a href="http://hadoop.apache.org/core/docs/current/quickstart.html">Hadoop Quick Start</a> guide first.  It was trivial to get working in standalone mode.  The documentation is sufficient, so I won&#8217;t discuss it further.</p>
<h3>Configuring hadoop in cluster mode</h3>
<p>I tried using the <a href="http://hadoop.apache.org/core/docs/current/cluster_setup.html">Hadoop Cluster Setup</a> guide next.  It describes four types of hadoop services, split across three types of machines:</p>
<ul>
<li>NameNode machine, runs NameNode service</li>
<li>JobTracker machine, runs JobTracker service</li>
<li>Slave machine, runs TaskTracker and DataNode services</li>
</ul>
<p>I wanted to have 2 slave nodes.  So I decided to split it up like:</p>
<ul>
<li>hadoop-0-0 - NameNode and JobTracker</li>
<li>hadoop-0-1 - DataNode and TaskTracker</li>
<li>hadoop-0-2 - DataNode and TaskTracker</li>
</ul>
<p>Here&#8217;s what my config files look like on all three machines:</p>
<p>/home/hadoop/conf/hadoop-env.sh: added one line.</p>
<pre>export JAVA_HOME=/usr/java/jdk1.6.0_06</pre>
<p>/home/hadoop/conf/masters:</p>
<pre>hadoop-0-0</pre>
<p>/home/hadoop/conf/slaves:</p>
<pre>hadoop-0-1
hadoop-0-2</pre>
<p>/home/hadoop/conf/hadoop-site.xml:</p>
<pre>&lt;?xml version="1.0"?&gt;
&lt;?xml-stylesheet type="text/xsl" href="configuration.xsl"?&gt;

&lt;!-- Put site-specific property overrides in this file. --&gt;

&lt;configuration&gt;
&lt;property&gt;
&lt;name&gt;fs.default.name&lt;/name&gt;
&lt;value&gt;hdfs://hadoop-0-0:9000/&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapred.job.tracker&lt;/name&gt;
&lt;value&gt;hdfs://hadoop-0-0:9001/&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;dfs.data.dir&lt;/name&gt;
&lt;value&gt;/home/hadoop/data&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapred.system.dir&lt;/name&gt;
&lt;value&gt;/home/hadoop/mapred/system&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapred.local.dir&lt;/name&gt;
&lt;value&gt;/home/hadoop/mapred/local&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapred.tasktracker.map.tasks.maximum&lt;/name&gt;
&lt;value&gt;2&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;property&gt;
&lt;name&gt;mapred.tasktracker.reduce.tasks.maximum&lt;/name&gt;
&lt;value&gt;2&lt;/value&gt;
&lt;final&gt;true&lt;/final&gt;
&lt;/property&gt;
&lt;/configuration&gt;</pre>
<p>This puts the NameNode on hadoop-0-0 port 9000 and the JobTracker on hadoop-0-0 port 9001, and says that each slave may run up to 2 map and 2 reduce tasks at a time (one for each of its 4 CPUs [remember, I'm hyperthreading]).</p>
<p>Make sure the configuration files are the same on all machines.</p>
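<p>One way to confirm that the copies really match is to parse each machine&#8217;s hadoop-site.xml and compare the property tables.  The sketch below assumes you have already fetched the file contents from each node (e.g. with scp); the function names are mine and not part of hadoop.</p>

```python
import xml.etree.ElementTree as ET

def site_properties(xml_text):
    """Map each property name in a hadoop-site.xml document to its value."""
    root = ET.fromstring(xml_text)
    return {
        prop.findtext("name", "").strip(): prop.findtext("value", "").strip()
        for prop in root.iter("property")
    }

def differing_keys(props_a, props_b):
    """Property names whose values differ, or that exist on only one side."""
    keys = set(props_a) | set(props_b)
    return sorted(k for k in keys if props_a.get(k) != props_b.get(k))
```

<p>Calling differing_keys on the tables from hadoop-0-0 and hadoop-0-1 should return an empty list if the two files agree; any name it returns is a property to look at.</p>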
<h2>Starting hadoop services</h2>
<p>From here I pretty much followed the tutorial.</p>
<p>First, set up the filesystem for the NameNode service on the NameNode (hadoop-0-0 for me):</p>
<pre>/home/hadoop/bin/hadoop namenode -format</pre>
<p>Next start up HDFS, the distributed filesystem:</p>
<pre>/home/hadoop/bin/start-dfs.sh</pre>
<p>Then start up the JobTracker service on the JobTracker node (hadoop-0-0 for me):</p>
<pre>/home/hadoop/bin/start-mapred.sh</pre>
<p>This also starts up the TaskTracker service on all the nodes specified in the /home/hadoop/conf/slaves file.  (The DataNode services on those nodes were already started by start-dfs.sh in the previous step.)</p>
<p>At this point, doing a:</p>
<pre>ps x</pre>
<p>as the hadoop user on the hadoop-0-{0,1,2} machines will show some java daemons running.  If you run into trouble, there is diagnostic information in the /home/hadoop/logs/*.log files.  These were indispensable in debugging my setup.</p>
<h2>Testing hadoop setup</h2>
<h3>Testing HDFS</h3>
<p>To test HDFS, the distributed file system, I ran the following on hadoop-0-0:</p>
<pre>
[hadoop@hadoop-0-0 ~]$ ./bin/hadoop dfs -ls /
Found 1 items
/home   &lt;dir&gt;           2008-06-20 18:58        rwxr-xr-x       hadoop  supergroup
[hadoop@hadoop-0-0 ~]$ ./bin/hadoop dfs -touchz /foo
[hadoop@hadoop-0-0 ~]$ ./bin/hadoop dfs -ls /
Found 2 items
/foo    &lt;r 3&gt;   0       2008-06-20 23:49        rw-r--r--       hadoop  supergroup
/home   &lt;dir&gt;           2008-06-20 18:58        rwxr-xr-x       hadoop  supergroup
[hadoop@hadoop-0-0 ~]$ ./bin/hadoop dfs -rm /foo
Deleted /foo
[hadoop@hadoop-0-0 ~]$
</pre>
<h3>Testing MapReduce</h3>
<p>Haven&#8217;t done this yet, except in standalone mode.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/06/20/notes-on-setting-up-hadoop-on-centos-5-x86/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Desktop Tower Defense</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/05/31/desktop-tower-defense/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/05/31/desktop-tower-defense/#comments</comments>
		<pubDate>Sat, 31 May 2008 20:13:26 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Fun]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=33</guid>
		<description><![CDATA[
7872.  It&#8217;s either my most recent score, or a lower bound on the number of minutes I&#8217;ve spent on Desktop Tower Defense.  Take your pick, just don&#8217;t tell my dissertation committee  
Play here, or check my mazes.  Warning, it&#8217;s addictive if you like puzzle/realtime strategy games.
]]></description>
			<content:encoded><![CDATA[<div style="float:right"><img src="http://www.handdrawngames.com/DesktopTD/Maps/5498530.gif" style="width:200px"/></div>
<p>7872.  It&#8217;s either my most recent score, or a lower bound on the number of minutes I&#8217;ve spent on Desktop Tower Defense.  Take your pick, just don&#8217;t tell my dissertation committee <img src='http://www.spicylogic.com/allenday/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /> </p>
<p>Play <a href="http://www.handdrawngames.com/DesktopTD/game.asp">here</a>, or <a href="http://www.handdrawngames.com/DesktopTD/ViewMap.asp?name=allenday">check my mazes</a>.  Warning, it&#8217;s addictive if you like puzzle/realtime strategy games.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/05/31/desktop-tower-defense/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Los Angeles SoC(i)al Tech Scene</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/05/30/los-angeles-social-tech-scene/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/05/30/los-angeles-social-tech-scene/#comments</comments>
		<pubDate>Fri, 30 May 2008 20:37:35 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Business]]></category>

		<category><![CDATA[Computing]]></category>

		<category><![CDATA[Networking]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=32</guid>
		<description><![CDATA[I&#8217;m collating here all the tech events/sites I hear about that are specific to Los Angeles / Southern California.  This post is the result of several conversations I&#8217;ve had, verbally and via email, about the scattered tech  scene in Los Angeles.  Disclaimer: I&#8217;m not the organizer of any of these events, and [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m collating here all the tech events/sites I hear about that are specific to Los Angeles / Southern California.  This post is the result of several conversations I&#8217;ve had, verbally and via email, about the scattered tech  scene in Los Angeles.  Disclaimer: I&#8217;m not the organizer of any of these events, and most of them I&#8217;ve never even attended so I can&#8217;t vouch for the quality.  For now, the events are in the order in which I hear about them.</p>
<p>If you know of a resource I missed, leave a comment and I&#8217;ll add it in.</p>
<ul>
<li><a href="http://g33kd1nner.com/">Los Angeles g33k d1nner</a> <a href="http://feeds.feedburner.com/LosAngelesG33kD1nner"><img src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/05/images.jpeg" alt="RSS" title="rss" width="36" height="14" class="alignnone size-medium wp-image-31" /></a></li>
<li><a href="http://barcamp.org/BarCampLosAngeles">BarCamp Los Angeles</a> <a href="http://barcamp.org/rss2.php"><img src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/05/images.jpeg" alt="RSS" title="rss" width="36" height="14" class="alignnone size-medium wp-image-31" /></a></li>
<li><a href="http://startuplaevent.com/">StartupLA</a></li>
<li><a href="http://www.DigitalLA.net">Digital LA</a></li>
<li><a href="http://www.mobilemondayla.com/">Los Angeles Mobile Mondays</a></li>
<li><a href="http://www.lunch20.com/">Lunch 2.0</a> <a href="http://www.lunch20.com/feed/"><img src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/05/images.jpeg" alt="RSS" title="rss" width="36" height="14" class="alignnone size-medium wp-image-31" /></a></li>
<li><a href="http://www.socaltech.com/">socalTECH</a> <a href="http://www.socaltech.com/news/news.rss"><img src="http://www.spicylogic.com/allenday/blog/wp-content/uploads/2008/05/images.jpeg" alt="RSS" title="rss" width="36" height="14" class="alignnone size-medium wp-image-31" /></a></li>
<li><a href="http://calendars.techvenue.com/cgi-bin/techvenue.pl?CalendarName=Los_Angeles">Los Angeles Business Technology Events Calendar</a></li>
<li><a href="http://www.engineer.ucla.edu/events/index.html#Current">UCLA Engineering Seminars</a></li>
<li><a href="http://latechcalendar.com/">LA Tech Calendar</a></li>
<li><a href="http://losangeles.pm.org/">Los Angeles Perl Mongers</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/05/30/los-angeles-social-tech-scene/feed/</wfw:commentRss>
		</item>
		<item>
		<title>The Construction and Usage of a Microarray Data Warehouse</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/05/27/the-construction-and-usage-of-a-microarray-data-warehouse/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/05/27/the-construction-and-usage-of-a-microarray-data-warehouse/#comments</comments>
		<pubDate>Wed, 28 May 2008 04:40:20 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Random musings]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=30</guid>
		<description><![CDATA[That&#8217;s my dissertation topic, and I&#8217;m defending it Thursday morning.  You can grab the latest PDF here.  I&#8217;ll update this post with my PowerPoint slides when I finish them.  The crux of the work was published last year as Celsius: A community resource for affymetrix microarray data. Genome Biology, 6(8), 2007. [pdf]
Update: [...]]]></description>
			<content:encoded><![CDATA[<p>That&#8217;s my dissertation topic, and I&#8217;m defending it Thursday morning.  You can grab the latest PDF <a href="http://genome.ucla.edu/u/~allenday/thesis.pdf">here</a>.  I&#8217;ll update this post with my PowerPoint slides when I finish them.  The crux of the work was published last year as Celsius: <em>A community resource for affymetrix microarray data</em>. Genome Biology, 6(8), 2007. [<a href="http://genome.ucla.edu/u/~allenday/celsius.pdf">pdf</a>]</p>
<p><b>Update:</b> <a href="http://genome.ucla.edu/u/~allenday/Defense.ppt">Powerpoint Slides</a> are now online.  Also, my oral defense is complete.  Other than filing my defense with the <a href="http://www2.library.ucla.edu/libraries/researchlibrary/index.cfm">UCLA research library</a>, I am now a Doctor of Philosophy, Human Genetics.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/05/27/the-construction-and-usage-of-a-microarray-data-warehouse/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Statistical HTML Content Extraction</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/05/27/statistical-html-content-extraction/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/05/27/statistical-html-content-extraction/#comments</comments>
		<pubDate>Tue, 27 May 2008 08:48:08 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Informatics]]></category>

		<category><![CDATA[Software]]></category>

		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=29</guid>
		<description><![CDATA[Introduction
I&#8217;ve been learning about some of the techniques used by the so-called &#8220;Black Hat SEO&#8221; community for boosting their rankings in search engine results.  Intriguing stuff.  I&#8217;m by no means an expert in this area, but the theory underlying building black-hat pages and networks sure looks like it has a lot to do [...]]]></description>
			<content:encoded><![CDATA[<h2>Introduction</h2>
<p>I&#8217;ve been learning about some of the techniques used by the so-called &#8220;Black Hat SEO&#8221; community for boosting their rankings in search engine results.  Intriguing stuff.  I&#8217;m by no means an expert in this area, but the theory underlying building black-hat pages and networks sure looks like it has a lot to do with <a href="http://en.wikipedia.org/wiki/Network_analysis">my</a> <a href="http://en.wikipedia.org/wiki/Bioinformatics">primary</a> <a href="http://en.wikipedia.org/wiki/Genomics">areas</a> <a href="http://en.wikipedia.org/wiki/Informatics">of</a> <a href="http://en.wikipedia.org/wiki/Machine_learning">interest</a>.</p>
<h2>Generating Unique Content</h2>
<p>One &#8220;Black Hat SEO&#8221; application area is automatically generating HTML pages to improve search engine rankings.  This technique uses a <a href="http://en.wikipedia.org/wiki/Markov_process">Markov process</a> to generate text.  The idea is to build one or more web pages that contain the keywords the SEO is targeting.  The method basically works like this:</p>
<ol>
<li>Assemble a corpus of text to train the model.  For example, <a href="http://www.gutenberg.org/wiki/Main_Page">Project Gutenberg</a></li>
<li>Build an order-N (typically N=2) <a href="http://en.wikipedia.org/wiki/Markov_model">Markov model</a> that captures the state changes in the corpus</li>
<li>Generate text from the model, periodically throwing in some keywords</li>
<li>Link the generated page to some other page to which you want to send traffic</li>
<li>Repeat from Step 1</li>
</ol>
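<p>Steps 2 and 3 above can be sketched in a few lines of Python.  This is a toy, not anybody&#8217;s production generator: the function names, the one-sentence corpus and the &#8220;keyword every N words&#8221; rule are all my own assumptions for illustration.</p>

```python
import random
from collections import defaultdict

def build_model(words, order=2):
    """Map each run of `order` consecutive words to the words seen after it."""
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model

def generate(model, length, keywords=(), keyword_every=10, seed=None):
    """Random-walk the model, dropping in a keyword every `keyword_every` words."""
    rng = random.Random(seed)
    states = sorted(model)
    state = rng.choice(states)
    output = list(state)
    for step in range(1, length + 1):
        followers = model.get(state)
        if not followers:  # dead end in the corpus: restart from a random state
            state = rng.choice(states)
            followers = model[state]
        word = rng.choice(followers)
        output.append(word)
        state = state[1:] + (word,)
        if keywords and step % keyword_every == 0:
            output.append(rng.choice(list(keywords)))
    return " ".join(output)

# Tiny stand-in corpus; a real run would train on a much larger text.
corpus = ("it is a truth universally acknowledged that a single man in "
          "possession of a good fortune must be in want of a wife").split()
model = build_model(corpus, order=2)
print(generate(model, length=20, keywords=["SEO", "Markov model"], seed=1))
```

<p>Note that the keywords are appended without updating the model state, which is exactly why they tend to stick out of the generated text.</p>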
<p>One problem with this approach &#8212; aside from the fact that the keywords don&#8217;t really fit in with the flow of the model &#8212; is that the model is trained on inappropriate text.  For instance, suppose you were trying to optimize for keywords:</p>
<ul>
<li>keywords</li>
<li>statistics</li>
<li>Search engine optimization</li>
<li>SEO</li>
<li>Automatic content generation</li>
<li>Automatic content extraction</li>
<li>HTML content extraction</li>
<li>Markov Model</li>
</ul>
<p>&#8230; then you probably wouldn&#8217;t want to train your model on, say, Jane Austen&#8217;s <a href="http://www.gutenberg.org/etext/1342">Pride and Prejudice</a>.</p>
<h2>Improve Generated Text: Use Niche Corpora</h2>
<p>A better thing to do would be to find some nice web pages containing <strong>keywords</strong>, <strong>statistics</strong>, <strong>seo</strong>, <strong>Markov model</strong>, and so on.  That way you&#8217;ll pick up related keywords that you didn&#8217;t initially think of (or weren&#8217;t suggested by your keyword expansion tool), too.</p>
<p>But let&#8217;s face it.  The corpora are going to be in HTML format.  So the question now becomes, <strong>How do I automate the transformation of HTML into plain text for input to the model?</strong>  A few strawman ideas, followed by my remarks:</p>
<ul>
<li>Get an HTML document, and remove all &lt;element/&gt;s. <i>Won&#8217;t work very well.  You end up training on page navigation, footers, headers, etc.</i></li>
<li>Build a site- or software-specific parser (e.g. for Wikipedia, or for Wordpress) to extract the main content.  <i>Scalability and maintenance nightmare.  This does not generalize to arbitrary pages.  You&#8217;ll be constantly fixing broken parsers, too.</i></li>
<li>Devise a scoring system that can identify the main content of the page.  <i>Exactly!</i></li>
</ul>
<p>I did find some methods for scoring page fragments, such as the Perl modules <a href="http://search.cpan.org/~jtaverni/HTML-Content-Extractor-0.01/">HTML::Content::Extractor</a> and <a href="http://search.cpan.org/~cselt/HTML-Extract-0.15/">HTML::Extract</a>, and another method described by <a href="http://www.perlmonks.org/?node_id=57631">Nooks</a>.  There are also a few interesting ideas in <a href="http://www2003.org/cdrom/papers/refereed/p583/p583-gupta.html">Gupta&#8217;s WWW2003 paper</a>.</p>
<p>None of that Perl code linked above <em>actually works</em>, but Nooks and Jean Tavernier generally had the right idea.  Basically, they look &#8220;down&#8221; the DOM to find the sub-DOM with the highest text/tag ratio.</p>
<p>The main problem with this approach is that it biases for DOM leaves, or &#8220;twigs&#8221; that are very close to leaves.  You end up having to write special rules for accommodating the idiosyncrasies of each particular page dealt with, and it basically turns back into an HTML parsing exercise.</p>
<p>The other problem, and possibly more significant one from a statistician&#8217;s point of view, is that the ratio is not a well-understood metric for making decisions about what constitutes a &#8220;good&#8221; versus a &#8220;bad&#8221; sub-document.  It would be better to have a p-value&#8230;</p>
<h2>Balls and Urns</h2>
<p>Fortunately, <a href="http://en.wikipedia.org/wiki/Fisher's_exact_test">Fisher&#8217;s exact test</a> can be applied to this problem.  Here&#8217;s how you can apply it; the explanation follows.  First, let&#8217;s define some variables:</p>
<ul>
<li><b>X</b>: the total number of words in the whole document.</li>
<li><b>x</b>: the number of words in a sub-document.</li>
<li><b>Y</b>: the total number of &lt;element/&gt;s in the whole document.</li>
<li><b>y</b>: the number of &lt;element/&gt;s in a sub-document.</li>
</ul>
<p>Then, we perform the following algorithm to identify the single best sub-document:</p>

<div class="wp_syntax"><div class="code"><pre class="c">tree; <span style="color: #808080; font-style: italic;">//the HTML tree's root node</span>
minP <span style="color: #66cc66;">=</span> <span style="color: #cc66cc;">1</span>; <span style="color: #808080; font-style: italic;">//minimum p-value observed in the document</span>
subD <span style="color: #66cc66;">=</span> <span style="color: #ff0000;">&quot;&quot;</span>; <span style="color: #808080; font-style: italic;">//sub-document corresponding to minimum p-value</span>
X <span style="color: #66cc66;">=</span> calculatex<span style="color: #66cc66;">&#40;</span>tree<span style="color: #66cc66;">&#41;</span>;
Y <span style="color: #66cc66;">=</span> calculatey<span style="color: #66cc66;">&#40;</span>tree<span style="color: #66cc66;">&#41;</span>;
look<span style="color: #66cc66;">&#40;</span>tree<span style="color: #66cc66;">&#41;</span>;
<span style="color: #000000; font-weight: bold;">function</span> look <span style="color: #66cc66;">&#40;</span>node<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
  x <span style="color: #66cc66;">=</span> calculatex<span style="color: #66cc66;">&#40;</span>node<span style="color: #66cc66;">&#41;</span>;
  y <span style="color: #66cc66;">=</span> calculatey<span style="color: #66cc66;">&#40;</span>node<span style="color: #66cc66;">&#41;</span>;
  p <span style="color: #66cc66;">=</span> calculateHyperG<span style="color: #66cc66;">&#40;</span>x,y,X,Y<span style="color: #66cc66;">&#41;</span>;
  <span style="color: #b1b100;">if</span> <span style="color: #66cc66;">&#40;</span> p &lt; minP <span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
    minP <span style="color: #66cc66;">=</span> p;
    subD <span style="color: #66cc66;">=</span> node;
  <span style="color: #66cc66;">&#125;</span>
  C <span style="color: #66cc66;">=</span> children<span style="color: #66cc66;">&#40;</span>node<span style="color: #66cc66;">&#41;</span>;
  foreach <span style="color: #66cc66;">&#40;</span>c in C<span style="color: #66cc66;">&#41;</span> <span style="color: #66cc66;">&#123;</span>
    look<span style="color: #66cc66;">&#40;</span>c<span style="color: #66cc66;">&#41;</span>;
  <span style="color: #66cc66;">&#125;</span>
<span style="color: #66cc66;">&#125;</span></pre></div></div>

<h2>Balls and Urns, Explained</h2>
<p>The pseudocode above examines each sub-document of the HTML document in turn and identifies the one with the smallest p-value.  The p-value is calculated using the <a href="http://en.wikipedia.org/wiki/Hypergeometric_distribution">hypergeometric distribution</a>: a sub-document has <b>x</b> words and <b>y</b> HTML &lt;element/&gt;s, and it is considered in the context of the total document having <b>X</b> words and <b>Y</b> HTML &lt;element/&gt;s.  It&#8217;s better than a simple ratio calculation because it does not bias for the tree&#8217;s leaves.  That is, the p-value depends on both the proportion of words in the sub-document and its size, <b>x+y</b>, so a tiny leaf with a high text/tag ratio does not automatically win.</p>
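<p>Here is a runnable Python sketch of the pseudocode, under a few assumptions that are mine rather than anything specified above: the page is a hand-built toy tree instead of parsed HTML, the word counts are made up, and the p-value is the upper tail of the hypergeometric distribution (the chance of drawing at least <b>x</b> words when picking <b>x+y</b> balls from an urn of <b>X</b> words and <b>Y</b> elements).  It needs Python 3.8 or later for math.comb.</p>

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    """One element of a toy DOM: its tag, its own word count, its children."""
    tag: str
    own_words: int = 0
    children: list = field(default_factory=list)

def count_words(node):  # "calculatex" in the pseudocode
    return node.own_words + sum(count_words(c) for c in node.children)

def count_elements(node):  # "calculatey" in the pseudocode
    return 1 + sum(count_elements(c) for c in node.children)

def hyperg_p(x, y, X, Y):
    """P(at least x words) when drawing x + y balls from X words and Y elements."""
    drawn = x + y
    tail = sum(math.comb(X, i) * math.comb(Y, drawn - i)
               for i in range(x, min(drawn, X) + 1))
    return tail / math.comb(X + Y, drawn)

def subtrees(node):
    """Yield the node and every descendant, parents before children."""
    yield node
    for child in node.children:
        yield from subtrees(child)

def best_subdocument(root):
    """Return (p-value, node) for the sub-document with the smallest p-value."""
    X, Y = count_words(root), count_elements(root)
    scored = [(hyperg_p(count_words(n), count_elements(n), X, Y), n)
              for n in subtrees(root)]
    return min(scored, key=lambda pair: pair[0])

# Made-up page: a navigation block, an article body, and a footer.
page = Node("html", 0, [
    Node("div#nav", 0, [Node("a", 2), Node("a", 2), Node("a", 1)]),
    Node("div#content", 0, [Node("p", 120), Node("p", 90), Node("i", 5)]),
    Node("div#footer", 0, [Node("a", 3), Node("script", 0)]),
])
p, node = best_subdocument(page)
print(node.tag, p)
```

<p>On this toy page the article body as a whole scores better than either of its paragraphs on their own, which is the behaviour the plain text/tag ratio fails to give.  The root always gets p = 1, because drawing every ball from the urn has only one possible outcome.</p>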
<h3>Caveats</h3>
<p>Bear in mind that testing so many sub-documents, especially for very large HTML documents, warrants so-called &#8220;<a href="http://en.wikipedia.org/wiki/Multiple_comparisons">multiple hypothesis testing correction</a>&#8221;, such as a <a href="http://en.wikipedia.org/wiki/Bonferroni_correction">Bonferroni correction</a>.  It&#8217;s outside the scope of this article.</p>
<p>Also, the tests performed are not entirely independent.  That is, if node B is a child of node A then B will have some effect on A when calculating A&#8217;s p-value and must be factored out.  This is also a well-defined problem but is, alas, also outside the scope of this article.  Do your homework! <b>Hint</b>: learn about the <a href="http://en.wikipedia.org/wiki/Gene_ontology">Gene Ontology</a>.</p>
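<p>For reference, the Bonferroni correction mentioned above is small enough to sketch: multiply each p-value by the number of tests performed and cap the result at 1.  The function name and the sample values are mine, and this does nothing about the parent/child dependence issue.</p>

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capping the result at 1."""
    tests = len(p_values)
    return [min(1.0, p * tests) for p in p_values]

# Three hypothetical sub-document p-values.
print(bonferroni([0.25, 0.125, 0.5]))
```

<p>Because every p-value is scaled by the same factor, the ranking of sub-documents does not change; what changes is whether the best one still looks significant.</p>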
<h2>Conclusion</h2>
<p>Fine and dandy, but does it work?  My conclusion: seems to work.  Here&#8217;s a CGI script demonstrating the <a href="http://www.spicylogic.com/allenday/cgi-bin/hyperG.cgi?u=http://www.cnn.com">hypergeometric content extraction</a> technique on CNN.com.  It reports a text snippet at the beginning and end of the single &#8220;best&#8221; sub-document and the corresponding (uncorrected) p-value.  Twiddle the <b>u</b> parameter to test on a page of your choice.  Some pages may block the user-agent I&#8217;m using&#8230;</p>
<p>There is also the issue of what to consider an element and what not to&#8230; or maybe even element weighting.  For instance, maybe &lt;p/&gt; and &lt;i/&gt; elements shouldn&#8217;t be penalized because they&#8217;re commonly associated with text, while &lt;script/&gt; elements should be heavily penalized.</p>
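<p>A minimal Python sketch of the weighting idea, assuming each element contributes an integer weight in place of a flat count of 1.  The weights, the names <code>ELEMENT_WEIGHTS</code> and <code>weighted_element_count</code>, and the sample tag list are all hypothetical; sensible values would have to be found by experiment.</p>

<div class="wp_syntax"><div class="code"><pre class="python"># Hypothetical weights: text-friendly tags cost nothing, scripts cost a lot.
ELEMENT_WEIGHTS = {"p": 0, "i": 0, "script": 5}
DEFAULT_WEIGHT = 1  # any tag not listed counts as one element


def weighted_element_count(tags):
    """Sum the per-tag weights for a list of element names."""
    return sum(ELEMENT_WEIGHTS.get(tag.lower(), DEFAULT_WEIGHT) for tag in tags)


# Hypothetical sub-document containing four elements.
sample_tags = ["p", "i", "div", "script"]
print(weighted_element_count(sample_tags))</pre></div></div>

<p>Whole-number weights keep the result usable as the element count <b>y</b> (or <b>Y</b>) in the hypergeometric calculation, which works on counts.</p>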
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/05/27/statistical-html-content-extraction/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Culver City Barking Dogs, Los Angeles Barking Dogs, California Barking Dogs</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/05/21/culver-city-barking-dogs-los-angeles-barking-dogs-california-barking-dogs/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/05/21/culver-city-barking-dogs-los-angeles-barking-dogs-california-barking-dogs/#comments</comments>
		<pubDate>Wed, 21 May 2008 19:12:32 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[Random musings]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=28</guid>
		<description><![CDATA[I&#8217;ve been collating information on laws for the area where I live that relate to barking dogs.  If you&#8217;re in Culver City, Los Angeles County, or the state of California, some or all of this should be useful to you.  If you also have a Home Owners&#8217; Association there is probably also wording [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been collating information on laws for the area where I live that relate to barking dogs.  If you&#8217;re in Culver City, Los Angeles County, or the state of California, some or all of this should be useful to you.  If you also have a Home Owners&#8217; Association there is probably also wording in your CC&amp;Rs or Bylaws that prohibits annoying noises and describes the powers of enforcement given to the HOA.</p>
<p>These links are also more generally useful for looking up local laws:</p>
<p><a href="http://www.amlegal.com/nxt/gateway.dll/California/culver/themunicipalcodeofthecityofculvercitycal?f=templates$fn=default.htm$3.0$vid=amlegal:culvercity_ca">Culver City Code</a></p>
<p><a href="http://municipalcodes.lexisnexis.com/codes/lacounty/">Los Angeles County Code</a></p>
<p><a href="http://www.leginfo.ca.gov/calaw.html">California Code</a></p>
<p>I also found <a href="http://www.scvsheriff.com/rec_ask_sgt_harris.asp?page=11">this page</a> and <a href="http://friendsofculvercityanimals.org/animal_control.html">this page</a> to be useful.</p>
<p>Below are the specific sections of the relevant codes that apply to barking dogs:</p>
<p><strong>Culver City § 9.07.030 ANIMALS AND FOWL</strong><br />
Any animal or fowl which emanates sound or outcry in an excessive, continuous, or untimely fashion, shall be considered a public nuisance and is subject to abatement pursuant to Chapter 9.04 of the Culver City Municipal Code.</p>
<p>(&#8217;65 Code, § 23-44.6) (Ord. No. 95-004 § 2 (part))</p>
<p><strong>Culver City § 9.01.035 ANIMAL ANNOYANCE PROHIBITED</strong><br />
It shall be unlawful for any person to harbor or keep any animal, bird or fowl which disturbs the peace or causes annoyance or disturbance to the neighborhood or reasonably interferes with the peace, comfort or repose of any person or persons in the quiet enjoyment of his or their property, by repeated or continuous barking, howling, whining, or making other sounds common to their species, between the hours of 10:00 p.m. and 8:00 a.m. and such disturbance shall be deemed to constitute the maintenance of a nuisance. Provided, however, that the prohibitions contained in this Section shall not apply to a licensed kennel owner or hospital or other place in which animals, birds or fowl are kept pursuant to a license or permit issued by governmental agencies.</p>
<p>(&#8217;65 Code, § 5-7) (Ord. No. CS-415 § 5-17; Ord. No. CS-24 § 2(d))</p>
<p><strong>Los Angeles County Code § 13.45.010 Loud, unnecessary and unusual noise</strong><br />
Notwithstanding any other provisions of this chapter and in addition thereto, it shall be unlawful for any person to wilfully make or continue, or cause to be made or continued, any loud, unnecessary, and unusual noise which disturbs the peace or quiet of any neighborhood or which causes discomfort or annoyance to any reasonable person of normal sensitiveness residing in the area. The standard which may be considered in determining whether a violation of the provisions of this section exists may include, but not be limited to, the following:<br />
A. The level of noise;<br />
B. Whether the nature of the noise is usual or unusual;<br />
C. Whether the origin of the noise is natural or unnatural;<br />
D. The level and intensity of any background noise;<br />
E. The proximity of the noise to residential sleeping facilities;<br />
F. The nature and zoning of the area within which the noise emanates;<br />
G. The density of the inhabitation of the area within which the noise emanates;<br />
H. The time of the day or night the noise occurs;<br />
I. The duration of the noise;<br />
J. Whether the noise is recurrent, intermittent, or constant; and<br />
K. Whether the noise is produced by a commercial or noncommercial activity. (Ord. 2001-0075 § 1 (part), 2001.)</p>
<p><strong>Los Angeles County Code § 13.45.020 Penalty</strong><br />
Any person violating this chapter is guilty of a misdemeanor punishable by a fine or by imprisonment no more than six months, or both. The fines imposed under this chapter are as follows:<br />
A. A fine of not more than $100.00 for a first violation;<br />
B. A fine of not more than $200.00 for a second violation of the same provision of this ordinance within one year;<br />
C. A fine of not more than $500.00 for each additional violation of the same provision of this ordinance within one year. (Ord. 2001-0075 § 1 (part), 2001.)</p>
<p><strong>Los Angeles County Code § 10.40.065 Public nuisance</strong><br />
A. Any animal (or animals) which molests passersby or passing vehicles, attacks other animals, trespasses on school grounds, is repeatedly at large, damages and or trespasses on private or public property, barks, whines or howls in a continuous or untimely fashion, shall be considered a public nuisance.</p>
<p>B. Every person who maintains, permits or allows a public nuisance to exist upon his or her property or premises, and every person occupying or leasing the property or premises of another and who maintains, permits or allows a public nuisance as described above to exist thereon, after reasonable notice in writing from the department of animal care and control has been served upon such person to cease such nuisance, is guilty of a misdemeanor. The existence of such nuisance for each and every day after the service of such notice shall be deemed a separate and distinct offense. (Ord. 2000-0075 § 54, 2000: Ord. 85-0204 § 24, 1985.)</p>
<p><strong>California Penal Code § 373A:</strong><br />
Every person who maintains, permits, or allows a public nuisance to exist upon his or her property or premises, and every person occupying or leasing the property or premises of another who maintains, permits or allows a public nuisance to exist thereon, after reasonable notice in writing from a health officer or district attorney or city attorney or prosecuting attorney to remove, discontinue or abate the same has been served upon such person, is guilty of a misdemeanor, and shall be punished accordingly; and the existence of such nuisance for each and every day after the service of such notice shall be deemed a separate and distinct offense, and it is hereby made the duty of the district attorney, or the city attorney of any city the charter of which imposes the duty upon the city attorney to prosecute state misdemeanors, to prosecute all persons guilty of violating this section by continuous prosecutions until the nuisance is abated and removed.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/05/21/culver-city-barking-dogs-los-angeles-barking-dogs-california-barking-dogs/feed/</wfw:commentRss>
		</item>
		<item>
		<title>Configure Wordpress Ping</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/05/19/configure-wordpress-ping/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/05/19/configure-wordpress-ping/#comments</comments>
		<pubDate>Mon, 19 May 2008 19:23:14 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
		
		<category><![CDATA[PHP]]></category>

		<category><![CDATA[Random musings]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/2008/05/19/configure-wordpress-ping/</guid>
		<description><![CDATA[I wanted to configure Wordpress pinging for the Facebook Flog Blog application.  For some reason the feed on my profile page isn&#8217;t updating, and I thought maybe this would do the trick.
Took a bit of digging, but I found a guide at Technorati.  Hint: &#8220;options&#8221; has been (moved and) renamed as &#8220;settings&#8221; as [...]]]></description>
			<content:encoded><![CDATA[<p>I wanted to configure Wordpress pinging for the Facebook <a href="http://apps.facebook.com/flogblog">Flog Blog</a> application.  For some reason the feed on my profile page isn&#8217;t updating, and I thought maybe this would do the trick.</p>
<p>Took a bit of digging, but I found a guide at <a href="http://technorati.com/developers/ping/wordpress.html">Technorati</a>.  Hint: &#8220;options&#8221; has been (moved and) renamed as &#8220;settings&#8221; as of Wordpress 2.5.1.</p>
<p>Let&#8217;s see if it works!</p>
<p><strong>Update</strong>: just by visiting the Flog Blog settings page, I have somehow managed to get Flog Blog to update.  Hmm&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/05/19/configure-wordpress-ping/feed/</wfw:commentRss>
		</item>
	</channel>
</rss>
