Clustering and High-Availability

check_hadoop_replication.pl (Advanced Nagios Plugins Collection)

Description:

Checks HDFS replication of blocks

Current Version

Last Release Date

October 3, 2013

Compatible With

  • Nagios 1.x
  • Nagios 2.x
  • Nagios 3.x
  • Nagios XI

Owner


Project Notes
Part of the Advanced Nagios Plugins Collection, download it here: https://github.com/harisekhon/nagios-plugins

./check_hadoop_replication.pl --help

Nagios Hadoop Plugin to check various health aspects of HDFS via the Namenode's dfsadmin -report

- checks % HDFS space used. Based off an earlier plugin I wrote in 2010 that we used in production for over 2 years. This heavily leverages HariSekhonUtils so code in this file is very short but still much tighter validated
- checks HDFS replication of blocks, again based off another plugin I wrote in 2010 around the same time as above and ran in production for 2 years. This code unifies/dedupes and improves on both those plugins
- checks HDFS % Used Balance is within thresholds
- checks number of available datanodes and if there are any dead datanodes

Originally written for old vanilla Apache Hadoop 0.20.x, updated for CDH 4.3 (2.0.0-cdh4.3.0)

Recommend you also investigate check_hadoop_cloudera_manager_metrics.pl (disclaimer I work for Cloudera but seriously it's good it gives you access to a wealth of information)

usage: check_hadoop_replication.pl [ options ]

    -s --hdfs-space        Checks % HDFS Space used against given warning/critical thresholds
    -r --replication       Checks replication state: under replicated blocks, corrupt blocks, missing blocks. Warning/critical thresholds apply to under replicated blocks. Corrupt and missing blocks if any raise critical since this means there is potentially data loss
    -b --balance           Checks Balance of HDFS Space used % across datanodes is within thresholds. Lists the nodes out of balance in verbose mode
    -n --nodes-available   Checks the number of available datanodes against the given warning/critical thresholds as the lower limits (inclusive). Any dead datanodes raises warning
    -w --warning           Warning threshold or ran:ge (inclusive)
    -c --critical          Critical threshold or ran:ge (inclusive)
       --hadoop-bin        Path to 'hdfs' or 'hadoop' command if not in $PATH
       --hadoop-user       Checks that this plugin is being run by the hadoop user (defaults to 'hdfs', falls back to trying 'hadoop' unless specified)
    -h --help              Print description and usage options
    -t --timeout           Timeout in secs (default: 10)
    -v --verbose           Verbose mode
    -V --version           Print version and exit
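
The -r / --replication behaviour described above can be illustrated with a short Python sketch. This is not the plugin's code (the plugin is written in Perl and relies on HariSekhonUtils); it is a minimal approximation under stated assumptions: the report text contains the "Under replicated blocks", "Blocks with corrupt replicas" and "Missing blocks" lines printed by dfsadmin -report, thresholds are treated as simple inclusive upper bounds (the ran:ge syntax is not implemented), and the names parse_block_counts, check_replication and SAMPLE_REPORT are hypothetical, as is the sample data.

```python
import re

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Illustrative sample only (assumption): a trimmed excerpt in the style of
# `hdfs dfsadmin -report`, not output captured from a real cluster.
SAMPLE_REPORT = """\
DFS Used%: 42.00%
Under replicated blocks: 12
Blocks with corrupt replicas: 0
Missing blocks: 0
"""

# Report lines this sketch looks for.
FIELDS = {
    "under_replicated": r"^Under replicated blocks:\s*(\d+)",
    "corrupt": r"^Blocks with corrupt replicas:\s*(\d+)",
    "missing": r"^Missing blocks:\s*(\d+)",
}


def parse_block_counts(report):
    """Extract the three block counters from the report text."""
    counts = {}
    for name, pattern in FIELDS.items():
        match = re.search(pattern, report, re.MULTILINE)
        if match is None:
            raise ValueError(f"field not found in report: {name}")
        counts[name] = int(match.group(1))
    return counts


def check_replication(report, warning, critical):
    """Return (exit_code, message) for the replication check.

    Assumption: warning/critical are plain upper bounds, inclusive,
    so a value equal to the threshold is still acceptable.
    """
    try:
        counts = parse_block_counts(report)
    except ValueError as exc:
        return UNKNOWN, f"UNKNOWN: {exc}"

    summary = (
        f"under replicated blocks: {counts['under_replicated']}, "
        f"corrupt blocks: {counts['corrupt']}, "
        f"missing blocks: {counts['missing']}"
    )
    # Any corrupt or missing block is critical regardless of thresholds,
    # since it means potential data loss.
    if counts["corrupt"] > 0 or counts["missing"] > 0:
        return CRITICAL, f"CRITICAL: {summary}"
    # Thresholds apply only to under replicated blocks.
    if counts["under_replicated"] > critical:
        return CRITICAL, f"CRITICAL: {summary}"
    if counts["under_replicated"] > warning:
        return WARNING, f"WARNING: {summary}"
    return OK, f"OK: {summary}"
```

Calling check_replication(SAMPLE_REPORT, warning=0, critical=20) on the sample above yields the WARNING code, because 12 under replicated blocks exceed the warning bound but not the critical one. A real check would feed in the live dfsadmin -report output and exit with the returned code.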
