Release Notes - Hadoop Common - Version 0.10.0

Bug

  • [HADOOP-546] - Task tracker doesn't generate job.xml in the jobcache for some tasks (possibly only for rescheduled tasks)
  • [HADOOP-596] - TaskTracker TaskStatus's phase doesn't get updated on phase transition, causing wrong values to be displayed in the web UI
  • [HADOOP-628] - hadoop dfs -cat replaces some characters with question marks.
  • [HADOOP-629] - none of the RPC servers check the protocol name for validity
  • [HADOOP-673] - the task execution environment should have a current working directory that is task specific
  • [HADOOP-700] - bin/hadoop includes all jar files in HADOOP_HOME in the classpath
  • [HADOOP-737] - TaskTracker's job cleanup loop should check for finished job before deleting local directories
  • [HADOOP-738] - dfs get or copyToLocal should not copy crc file
  • [HADOOP-744] - The site docs are not included in the release tar file
  • [HADOOP-745] - NameNode throws FileNotFoundException: Parent path does not exist on startup
  • [HADOOP-752] - Possible locking issues in HDFS Namenode
  • [HADOOP-764] - The memory consumption of processReport() in the namenode can be reduced
  • [HADOOP-770] - When the JobTracker gets restarted, the JobTracker history doesn't show the jobs that were running (incomplete jobs)
  • [HADOOP-774] - Datanodes fail to heartbeat when a directory with a large number of blocks is deleted
  • [HADOOP-777] - the tasktracker hostname is not fully qualified
  • [HADOOP-782] - TaskTracker.java:killOverflowingTasks & TaskTracker.java:markUnresponsiveTasks only put the TIP in the tasksToCleanup queue; they don't update runningJobs
  • [HADOOP-786] - PhasedFileSystem should use a debug-level log for ignored exceptions.
  • [HADOOP-792] - Invalid dfs -mv can trash your entire dfs
  • [HADOOP-794] - JobTracker crashes with ArithmeticException
  • [HADOOP-802] - The mapred.speculative.execution description in hadoop-default.xml is not complete
  • [HADOOP-813] - map tasks lost during sort
  • [HADOOP-814] - Increase dfs scalability by optimizing locking on namenode.
  • [HADOOP-818] - ant clean test-contrib doesn't work
  • [HADOOP-823] - DataNode will not start up if any directories from dfs.data.dir are missing
  • [HADOOP-824] - DFSShell should become FSShell
  • [HADOOP-825] - If the default file system is set using the new URI syntax, the namenode will not start
  • [HADOOP-829] - Separate the datanode content that is written to the fsimage from the content used in over-the-wire communication
  • [HADOOP-835] - conf not set for the default Codec when initializing a Reader for a record-compressed sequence file
  • [HADOOP-836] - unit tests fail on windows (/C:/cygwin/... is invalid)
  • [HADOOP-838] - TaskRunner.run() doesn't pass along the 'java.library.path' to the child (task) jvm
  • [HADOOP-840] - the task tracker is getting blocked by long deletes of local files
  • [HADOOP-841] - native hadoop libraries don't build properly with a 64-bit OS and a 32-bit JVM
  • [HADOOP-844] - Metrics messages are sent on a fixed-delay schedule instead of a fixed-rate schedule
  • [HADOOP-846] - Progress report is not sent during the intermediate sorts in the map phase
  • [HADOOP-849] - randomwriter fails with 'java.lang.OutOfMemoryError: Java heap space' in the 'reduce' task

New Feature

  • [HADOOP-454] - hadoop du should optionally behave like Unix's du -s
  • [HADOOP-574] - want FileSystem implementation for Amazon S3
  • [HADOOP-681] - Administrative hook to pull live nodes out of an HDFS cluster
  • [HADOOP-811] - Patch to support multi-threaded MapRunnable (see the configuration sketch after this list)
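
A brief, hedged illustration of the multi-threaded MapRunnable support from HADOOP-811. The class name MultithreadedMapRunner and the mapred.map.multithreadedrunner.threads property are assumptions based on the contributed patch, not confirmed by these notes; consult the javadoc shipped with this release for the exact API.

    // Hedged sketch only: class and property names are assumptions.
    import org.apache.hadoop.mapred.JobConf;
    import org.apache.hadoop.mapred.lib.MultithreadedMapRunner;

    public class MultithreadedJobSetup {
      public static JobConf configure(JobConf conf) {
        // Run the map() calls of a single task in a small thread pool,
        // useful when maps spend most of their time blocked on I/O.
        conf.setMapRunnerClass(MultithreadedMapRunner.class);
        conf.setInt("mapred.map.multithreadedrunner.threads", 10); // assumed property name
        return conf;
      }
    }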

Improvement

  • [HADOOP-331] - map outputs should be written to a single output file with an index
  • [HADOOP-371] - ant tar should package contrib jars
  • [HADOOP-451] - Add a Split interface
  • [HADOOP-470] - Some improvements in the DFS content browsing UI
  • [HADOOP-524] - Contrib documentation does not appear in Javadoc
  • [HADOOP-525] - Need raw comparators for hadoop record types
  • [HADOOP-571] - Path should use URI syntax
  • [HADOOP-618] - JobProfile and JobSubmissionProtocol should be public
  • [HADOOP-619] - Unify Map-Reduce and Streaming to take the same globbed input specification
  • [HADOOP-621] - When a dfs -cat command is killed by the user, the corresponding hadoop process does not get aborted
  • [HADOOP-676] - JobClient should print user friendly messages for standard errors
  • [HADOOP-717] - When there are few reducers, sorting should be done by mappers
  • [HADOOP-720] - Write a white paper on Hadoop File System Architecture, Design and Features
  • [HADOOP-756] - new dfsadmin command to wait until safe mode is exited
  • [HADOOP-763] - NameNode benchmark using mapred is insufficient
  • [HADOOP-783] - Hadoop dfs -put and -get accept '-' to indicate stdin/stdout
  • [HADOOP-796] - Nodes failing tasks and failed tasks should be more easily accessible through the JobTracker history.
  • [HADOOP-804] - Cut down on the "mumbling" in the Task process' stdout/stderr
  • [HADOOP-806] - NameNode web UI: include a link to each of the datanodes
  • [HADOOP-837] - RunJar should unpack jar files into hadoop.tmp.dir
  • [HADOOP-850] - Add Writable implementations for variable-length integer types (see the usage sketch after this list).
  • [HADOOP-853] - Move site directories to docs directories
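
A short, hedged usage sketch for the variable-length integer Writables from HADOOP-850. The class names VIntWritable and VLongWritable (in org.apache.hadoop.io) are assumptions about what the patch added; the point is that small values are encoded in fewer bytes on the wire than the fixed-width IntWritable/LongWritable.

    // Hedged sketch only: VIntWritable/VLongWritable are assumed class names.
    import java.io.*;
    import org.apache.hadoop.io.VIntWritable;
    import org.apache.hadoop.io.VLongWritable;

    public class VIntDemo {
      public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);

        // Small values take a single byte on the wire instead of four or eight.
        new VIntWritable(42).write(out);
        new VLongWritable(1234567L).write(out);
        out.flush();

        DataInputStream in =
            new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()));
        VIntWritable i = new VIntWritable();
        VLongWritable l = new VLongWritable();
        i.readFields(in);
        l.readFields(in);
        System.out.println(i.get() + " " + l.get()); // prints: 42 1234567
      }
    }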
