Continuous state estimation for heterogeneous Hadoop clusters


Event: International Workshop on Principles of Diagnosis: DX-2013


Shekhar Gupta
Christian Fritz
Roger Hoover
Technical Publications
August 19th 2013
Hadoop is a popular and highly successful framework for horizontally scalable distributed computing over large data sets, based on the MapReduce model. We present a tool that monitors the real-time performance of every node in a heterogeneous Hadoop cluster. The performance of a node is measured in terms of its slowdown. The tool is designed to help system administrators detect underperforming nodes, and it also helps identify which resource (CPU or disk) on a node is affected by the problem. In its current implementation, Hadoop assumes a homogeneous cluster of compute nodes. This assumption is manifest in Hadoop's scheduling algorithms, and it is also crucial to existing approaches for detecting performance issues, which rely on peer similarity between nodes. It is desirable to enable efficient use of Hadoop on heterogeneous clusters as well as on virtual/cloud infrastructure, both of which violate the peer-similarity assumption. We have implemented the monitoring tool and present preliminary results on an eight-node heterogeneous Hadoop cluster at PARC. We show that, using our tool, a system administrator can detect resource-specific performance problems (e.g., CPU contention, disk I/O contention) in a node.
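
The abstract does not say how slowdown is computed, so the following is only an illustrative sketch and not the authors' method. It assumes slowdown is the ratio of a node's observed task duration to that same node's healthy baseline, tracked separately for CPU and disk, so that no peer-similarity assumption is needed. The function names, node names, baseline values, and the 1.5 threshold are all hypothetical.

```python
from statistics import median


def slowdown(observed_durations, baseline_duration):
    """Ratio of the median observed duration to the node's own healthy baseline.

    Assumption: this ratio stands in for "slowdown"; the paper may define it differently.
    """
    return median(observed_durations) / baseline_duration


def flag_slow_resources(observations, baselines, threshold=1.5):
    """Return {(node, resource): slowdown} for every entry above the threshold."""
    flagged = {}
    for node, per_resource in observations.items():
        for resource, durations in per_resource.items():
            # Compare each node against its own baseline, not against its peers.
            ratio = slowdown(durations, baselines[node][resource])
            if ratio > threshold:
                flagged[(node, resource)] = ratio
    return flagged


# Hypothetical baselines (seconds per task) for two nodes with different hardware.
baselines = {
    "node-a": {"cpu": 10.0, "disk": 20.0},
    "node-b": {"cpu": 30.0, "disk": 40.0},
}

# Hypothetical recent measurements; node-a's disk-bound tasks take about 3x longer.
observations = {
    "node-a": {"cpu": [10.0, 11.0, 9.0], "disk": [60.0, 58.0, 62.0]},
    "node-b": {"cpu": [31.0, 29.0, 30.0], "disk": [41.0, 40.0, 39.0]},
}

print(flag_slow_resources(observations, baselines))
```

In this made-up data, node-b is slower than node-a in absolute terms but is not flagged, because each node is judged only against its own baseline.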


Gupta, S.; Fritz, C.; Price, R.; Hoover, R.; de Kleer, J.; Witteveen, C. Continuous state estimation for heterogeneous Hadoop clusters. International Workshop on Principles of Diagnosis: DX-2013; 2013 October 1-4; Jerusalem, Israel.
