<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Allen Day's Blog &#187; SGE</title>
	<atom:link href="http://www.spicylogic.com/allenday/blog/category/computing/distributed-systems/sge/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.spicylogic.com/allenday/blog</link>
	<description>♥data♥</description>
	<lastBuildDate>Mon, 21 Jun 2010 23:28:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Thoughts on Hadoop JobTracker/TaskTracker Scheduling</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/09/11/thoughts-on-hadoop-tasktracker-jobtrackerscheduling/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/09/11/thoughts-on-hadoop-tasktracker-jobtrackerscheduling/#comments</comments>
		<pubDate>Fri, 12 Sep 2008 01:07:59 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
				<category><![CDATA[Distributed Systems]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Random musings]]></category>
		<category><![CDATA[SGE]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/?p=67</guid>
		<description><![CDATA[Had a brief, interesting conversation on freenode #hadoop today with Rapleaf Engineer Nathan Marz about scheduling in Hadoop.
Pretty much supports my sense that scheduling is not Hadoop&#8217;s strong suit.  It&#8217;s really pretty shitty.  Would be great to see some more cross-pollination between the Beowulf (SGE, PBS, Globus) and MapReduce (Hadoop, HBase) communities. [...]]]></description>
			<content:encoded><![CDATA[<p>Had a brief, interesting conversation on freenode #hadoop today with <a href="http://blog.rapleaf.com/2008/06/11/rapleafs-newest-engineer-nathan-marz/">Rapleaf Engineer Nathan Marz</a> about scheduling in Hadoop.</p>
<p>Pretty much supports my sense that scheduling is not Hadoop&#8217;s strong suit.  It&#8217;s really pretty shitty.  Would be great to see some more cross-pollination between the Beowulf (SGE, PBS, Globus) and MapReduce (Hadoop, HBase) communities.  The former have more mature scheduling, resource management and permissions models.  They don&#8217;t really do a good job of providing a framework for distributed, parallel computing at the application level, though: everything is roll-your-own.  Perhaps Hadoop could be integrated as a parallel environment to consume resources from an SGE master [<a href="http://www.spicylogic.com/allenday/blog/2008/09/03/sge-hadoop-integration/">1</a>, <a href="http://www.spicylogic.com/allenday/blog/2008/08/08/hadoop-sge-grid-engine-convergence/">2</a>] rather than managing its own mapper/reducer pools.</p>
<p>A less ambitious scheduler improvement is to modify the way the Hadoop scheduler allocates map/reduce resources.  The main itch I&#8217;m trying to scratch right now has to do with the coupling of map/reduce allocation: a job can take reduce slots while it is still mapping.  There are some cases where it seems this shouldn&#8217;t be done.  Read the dialog with Nathan below if you care to know more.</p>
<table class="msg-table" border="0">
<tbody>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>is it possible to decouple mapper and reducer slot allocation for jobs?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>i mean, if a job is #1 in the MR queue, but it is not yet ready to reduce, can it be prevented from consuming reducer slots?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>allenday: i think that would be hard&#8230; reducing starts while the mapping is happening (copy stage)</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>nathanmarz, i frequently find that while the reduce has &#8220;started&#8221;, it can just sit there for a long time doing nothing</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>this is most common with nutch</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>so there could be a bunch of other jobs further back in the queue that get starved for reduces b/c the head of the queue is squatting on the slots</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>it just sits there in the reduce phase?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>for sure nutch does, yeah</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>during fetch, when it is crawling</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>i see</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>i don&#8217;t have that much familiarity with nutch</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>is it possible to increase the number of reducers?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>yep, but then you can get into i/o trouble later</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>for the job i mean, not the cluster</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>oh</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>it sounds like you propose having these squatters consume minimal # of reducers (e.g. only 1)</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>actually, the opposite</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>let&#8217;s say you have 16 reduce slots</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>and the job is set to use 16 reducers</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>each one of those reducers potentially has to go over a lot of data</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>if the job is instead set to use a lot more reducers, like 100 or something</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>then an individual reducer will go a lot faster</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>and potentially, those freed reduce slots will go to jobs with higher priority</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>ok, so you introduce priority to bump the jobs further back ahead in the queue</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>yea</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>is that settable in jobconf?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>you can set num reducers</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>so let&#8217;s suppose the job that squats on reduce slots gets to the head of the queue. regardless of if it has 16 or 100 reducers configured</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>JobConf#setNumReduceTasks</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>and that it is still in map phase only.  has not begun reducing yet</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>until one of those reduces finishes (i.e. the map has finished) all slots are still filled</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>it&#8217;s only when the first reduce finishes that the job at #2 can take over a reduce slot</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>right</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>yea that&#8217;s true</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>that&#8217;s bad</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>this scheme doesn&#8217;t help until mappers finished</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>you really want this #1 job</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>when it is allocating reducers</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>to have low priority in acquiring the slots</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>right</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>well you don&#8217;t want it to acquire any slots until mappers finish</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>so you give reduce slots to #2, #3, #4, etc.  until everyone who wants slots has them.  then you assign to #1</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>or until #1 is ready&#8230;</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>is it just me or does the queueing system in hadoop kind of suck?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>i am coming here from sun grid which puts a lot of emphasis on this aspect</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>well, the priority system will work if you start job #1 after the other jobs</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>if you start the other jobs after #1 then they will get starved of reducers</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>heh, but the whole reason it is in #1 is because it was submitted first, right?  isn&#8217;t hadoop FIFO wrt jobs?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>if they&#8217;re the same priority</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>so maybe decreasing the reducers job #1 uses is the way to go</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>set it so it doesn&#8217;t use all the reduce slots on the cluster</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>i need to do some research to see if there are jira open for improving the scheduler. or if there are some commercial plugins to improve the scheduling</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>definitely room for improvement, agreed</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>yeah, that was what i thought you meant initially.  it&#8217;s a hack too though, and breaks down when the number of jobs gets large</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>i&#8217;m surprised they are coupled.  do you understand how it works when the mapper hands off to the reducer?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>b/c i don&#8217;t and i need to</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>yes</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>can i get the 2min version?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>the reason the reducers start while the mappers are running is because there&#8217;s some work they can do without all the map data</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>each reducer needs to copy the relevant outputs from all the mappers to its machine</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>this is called the &#8220;copy&#8221; phase and can occur in parallel with mapping</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>ok, i&#8217;ve seen that</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>so what we need is a flag that indicates there will be no data to copy until maps all finish</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>yea</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>a flag that says not to pipeline the process</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>default behavior is to have the flag off and copy greedily</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>which is like it does now</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>turning the flag on says to wait until upstream map finishes before grabbing a reduce slot and kicking off the copy</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>**all upstream maps</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span><a class="chatzilla-link" href="http://hadoop.apache.org/core/docs/current/hadoop-default.html" target="_content">http://hadoop.apache.org/core/docs/current/hadoop-default.html</a></span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>those are all the hadoop config parameters</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>you might be able to find something in there</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>yeah, i find goodies in there every time i read that page<span class="chatzilla-emote-txt"> <img src='http://www.spicylogic.com/allenday/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </span></span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>i am only ~1mo into hadoop</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>here&#8217;s another scheduling related question/issue i&#8217;m having</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>i find that job i/o and cpu usage tend to synchronize after a while</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>b/c if there is a slow moving job in the queue, all the others tend to get jammed behind it</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>have you seen this?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>no, i haven&#8217;t</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>but that&#8217;s interesting</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>it comes back to resource (mis)allocation by the scheduler</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><a class="chatzilla-link" href="irc://irc.freenode.net/nathanmarz,isnick"><span>nathanmarz</span></a></td>
<td class="msg-data" colspan="5"><span>how are you measuring that?</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>it&#8217;s this same issue where jobs will consume all the slots</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>so if you have a slow moving thing blocking all the resources</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>no one else can get past</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>then when the slow moving job finishes, the others all start getting processed very quickly (high cpu load during map), then as they begin to finish there is a flurry of i/o</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>it&#8217;s like congestion on the freeway where one car slams on the breaks it sends this wave of traffic jam behind it</span></td>
</tr>
<tr class="msg">
<td class="msg-timestamp"></td>
<td class="msg-user"><span>allenday</span></td>
<td class="msg-data" colspan="5"><span>assuming the freeway is already close to capacity (not sparse)</span></td>
</tr>
</tbody>
</table>
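<p>The allocation idea from the dialog above can be sketched as a toy model.  This is only an illustration under stated assumptions, not Hadoop&#8217;s scheduler code: the <code>ReduceSlotSketch</code> class, its <code>Job</code> type and both <code>allocate</code> methods are hypothetical names made up for this example.  It compares strict FIFO hand-out of reduce slots with a deferred policy that serves jobs whose maps have finished first, then gives whatever is left to jobs that are still mapping.</p>
<pre>
```java
import java.util.Arrays;

// Toy model only: every name here is hypothetical and none of it is Hadoop API.
public class ReduceSlotSketch {

    // A queued job: how many reducers it asks for, and whether its maps are done.
    static final class Job {
        final String name;
        final int reducersWanted;
        final boolean mapsFinished;

        Job(String name, int reducersWanted, boolean mapsFinished) {
            this.name = name;
            this.reducersWanted = reducersWanted;
            this.mapsFinished = mapsFinished;
        }
    }

    // Greedy FIFO: hand out reduce slots strictly in queue order,
    // even to a job that cannot start reducing yet.
    static int[] allocateFifo(Job[] queue, int slots) {
        int[] granted = new int[queue.length];
        int free = slots;
        for (int i = 0; i != queue.length; i++) {
            granted[i] = Math.min(queue[i].reducersWanted, free);
            free -= granted[i];
        }
        return granted;
    }

    // Deferred: two passes in queue order. Jobs whose maps have finished
    // are served first; jobs still mapping get the leftover slots.
    static int[] allocateDeferred(Job[] queue, int slots) {
        int[] granted = new int[queue.length];
        int free = slots;
        boolean[] passes = { true, false };
        for (boolean readyPass : passes) {
            for (int i = 0; i != queue.length; i++) {
                if (queue[i].mapsFinished == readyPass) {
                    granted[i] = Math.min(queue[i].reducersWanted, free);
                    free -= granted[i];
                }
            }
        }
        return granted;
    }

    public static void main(String[] args) {
        // Job #1 is at the head of the queue but is still in its map phase.
        Job[] queue = {
            new Job("job1-still-mapping", 16, false),
            new Job("job2-ready", 4, true),
            new Job("job3-ready", 4, true),
        };
        // With 16 reduce slots, FIFO lets job #1 squat on all of them.
        System.out.println(Arrays.toString(allocateFifo(queue, 16)));     // prints [16, 0, 0]
        // The deferred policy serves the ready jobs and gives job #1 the rest.
        System.out.println(Arrays.toString(allocateDeferred(queue, 16))); // prints [8, 4, 4]
    }
}
```
</pre>
<p>In this sketch the job that is still mapping keeps its place in the queue and still receives the leftover slots, so on an idle cluster it could start its copy phase early as it does today.  A real scheduler would also have to decide when to re-run the allocation as maps complete, which this model leaves out.</p>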
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/09/11/thoughts-on-hadoop-tasktracker-jobtrackerscheduling/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SGE / Hadoop integration</title>
		<link>http://www.spicylogic.com/allenday/blog/2008/09/03/sge-hadoop-integration/</link>
		<comments>http://www.spicylogic.com/allenday/blog/2008/09/03/sge-hadoop-integration/#comments</comments>
		<pubDate>Thu, 04 Sep 2008 00:10:17 +0000</pubDate>
		<dc:creator>allenday</dc:creator>
				<category><![CDATA[Distributed Systems]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[SGE]]></category>

		<guid isPermaLink="false">http://www.spicylogic.com/allenday/blog/2008/09/03/sge-hadoop-integration/</guid>
		<description><![CDATA[Yet another interesting blog post I&#8217;ve found today on integrating Hadoop and Sun Grid Engine.
]]></description>
			<content:encoded><![CDATA[<p>Yet another interesting blog post I&#8217;ve found today on <a href="http://blogs.sun.com/ravee/entry/creating_hadoop_pe_under_sge">integrating Hadoop and Sun Grid Engine</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.spicylogic.com/allenday/blog/2008/09/03/sge-hadoop-integration/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
