Others

check_ganglia

Description:

check_ganglia.pl — A nagios plugin allowing checking of Ganglia (gmetad) XML entries. Supports all arbitrary XML data, standard ganglia metrics, gmetric-introduced points, etc.

Current Version

Last Release Date

June 22, 2009

Compatible With


Project Files
Project Notes
estair@monitor02 libexec$ ./check_ganglia.pl --help Unknown option: help UNKNOWN: HOST not defined. -H hostname/IP: of host to connect to gmetad/gmond on -P Port: to connect to and retrieve XML -O Output method: ('cluster' to dump all info | 'hostcheck' to grab for) -T Targethost: when 'hostcheck', the host to pull data for -M Metric: the 'gmetric' defined value to return exclusively -w warn: int value above which the check will exit in a WARN state -c crit: int value above which the check will exit in a CRITICAL state Use plugin for a specific check-host command, determine if host has checked in to ganglia cluster recently: define command{ command_name check-cluster-host-alive command_line $USER1$/check_ganglia.pl -H $HOSTADDRESS$ -P 8600 -O hostcheck -T localhost -M host_state } ......... # pull data direct from host via XML call, using a specified query string (!!> 100x faster !!) define command{ command_name check_ganglia_host_query command_line $USER1$/check_ganglia.pl -host=localhost -port=8652 -output=hostcheck -cluster=$ARG1$ -targethost=$HOSTALIAS$ -M $ARG2$ } define command{ command_name check_ganglia command_line $USER1$/check_rrd_eli.pl "/var/ganglia/rrds/$HOSTGROUPNAME$/$HOSTNAME$.lucasfilm.com/$ARG1$" sum $ARG2$ $ARG3$ }
Reviews (1) Add a Review
Incomplete, here 's a patch
by harakiri, March 31, 2013

Two things are needed. First of all, you need to run "cpan DateTime::Format::Epoch::Unix" to install this module. If anyone knows an RPM on RHEL/Centos please leave a comment. Next, the biggest problem, the correct exit status when treshhold is reached doesn't work. So I fixed it. I'm not a Perl programmer, so if it looks somewhat crude, please excuse me. Copy/paste the following code and run it as a patch: --- /tmp/check_ganglia.pl 2013-03-21 12:43:48.000000000 +0100 +++ check_ganglia.pl 2013-03-22 08:38:48.574700147 +0100 @@ -14,7 +14,6 @@ # TODO: call $cluster{host} hash directly instead of seeking within it. # TODO: Fix some clusters that don't match host checks... # TODO: add retval matching (range, string, etc) -# TODO: fix warn/crit to measure returned metric # TODO: !!! NEXT !!! call $cluster{host} hash directly instead of seeking within it. # TODO: !!! NEXT !!! better, pass in cluster:host context for direct passing of XML # TODO: !!! NEXT !!! use syntax localhost:8652 TCP @@ -22,6 +21,7 @@ # TODO: !!! NEXT !!! Or, since that requires knowing the CLUSTER, make it option and check the hostname as key # TODO: !!! NEXT !!! for the hash of each cluster found. Still reduces cpu/time drastically # +# 2013-03-22: Tom Kerremans: fixed warn/crit to measure returned metric, removed some obsolete notifications ########### # core modules needed: @@ -72,6 +72,7 @@ exit $ERRORS{'CRITICAL'}; } + sub isnumeric() { my ($x) = @_; @@ -128,14 +129,14 @@ } #/ if hostcheck if (defined($warn)) { - print "WARN definedn"; + #print "WARN definedn"; #if ( ! isnumeric($warn) ) { die "NOT NUMERIC n"; } #die "NOT NUMERIC n" if ( ! isnumeric($warn) ) ; die "## $warn is NOT NUMERIC n" if $_ =~ s/[a-z]//; } if (defined($crit)) { - print "CRIT definedn"; + #print "CRIT definedn"; } } #/ sub processargs @@ -301,9 +302,9 @@ ###: ELI: WTF did I put this in here for?? ### DELETEME -### FUNC: output_match -#sub output_match { -#my $output = shift; +## FUNC: output_match +sub output_match { +my $output = shift; # perform string regex match on retval: # if ( "$output" =~ /.*$match.*/ ) { @@ -314,17 +315,20 @@ # exit 2; # } -## perform range check for warn/crit values: -# if ( "$output" >= "$crit" ) { -# exit 1; -# } elsif ( "$output" >= "$warn" ) { -# exit 2; -# } else { -# exit 0; -# } +# perform range check for warn/crit values: + if ( "$output" >= "$crit" ) { + print "CRITICAL: $metric = $output higher than treshhold of $critn"; + exit 2; + } elsif ( "$output" >= "$warn" ) { + print "WARNING: $metric = $output higher than treshhold of $warnn"; + exit 1; + } else { + print "OK: $metric = $outputn"; + exit 0; + } -#} #/sub -### /FUNC: output_match +} #/sub +## /FUNC: output_match #^^^^ ##: ELI: WTF did I put this in here for?? #^^^^ ## DELETEME @@ -386,14 +390,14 @@ print "UNKNOWN: ($metric) not found in host XML! ","n"; exit $ERRORS{'UNKNOWN'} } else { - print "OK: $metric = $host_metrics{$metric} n"; - exit $ERRORS{'OK'}; + &output_match ($host_metrics{$metric}); } } # /unless ($metric) } else {# /if ($hostname eq) } # /if hostname loop through hash. We've exhausted input data, exit now: + } # /foreach $hostkey # don't exit here, create exit at end of all arrays to be searched (after function exits searching the last hash)



Add a Review

You must be logged in to submit a review.

Thank you for your review!

Your review has been submitted and is pending approval.

Recommend

To:


From:


Thank you for your recommendation!

Your recommendation has been sent.

Project Stats
Rating
4 (1)
Favorites
1
Views
104,849