
check_griderrors
Description: Check errors (and performance) of queues in SGE installations
Current Version: 0.1
Last Release Date: 2014-03-14
Compatible With: Nagios 3.x
License: MIT
Project Notes
This plugin checks the status of all queues of an SGE (Sun Grid Engine) installation, warns when a defined number of queues are in an error state, and logs performance data. The script must run on a submit host or an admin host of the SGE installation, so running it through NRPE is recommended.
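The threshold behaviour described above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the function name `evaluate_queue_errors`, the wording of the message, and the performance data label `queue_errors` are hypothetical, and the sketch assumes that a threshold is reached when the number of queues in error is greater than or equal to it. The exit codes (0 OK, 1 WARNING, 2 CRITICAL) and the `label=value;warn;crit` performance data layout follow the usual Nagios plugin conventions.

```shell
#!/bin/bash
# Hypothetical helper: map a count of queues in error state to a Nagios result.
# Arguments: <error_count> <warn_threshold> <crit_threshold> <total_queues>
evaluate_queue_errors() {
    local errors=$1 warn=$2 crit=$3 total=$4
    local status code

    # Assumption: a threshold is reached when errors >= threshold.
    if [ "$errors" -ge "$crit" ]; then
        status="CRITICAL"; code=2
    elif [ "$errors" -ge "$warn" ]; then
        status="WARNING"; code=1
    else
        status="OK"; code=0
    fi

    # Text before the pipe is the status message; text after it is
    # performance data in the form label=value;warn;crit.
    echo "$status - $errors of $total queues in error state | queue_errors=$errors;$warn;$crit"
    return "$code"
}

# One queue in error with -w 1 -c 2 yields a WARNING (return code 1).
evaluate_queue_errors 1 1 2 10
```

A real plugin would end with `exit "$code"` so that Nagios or NRPE sees the status; the sketch uses `return` so the function can be tried in an interactive shell.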
============ SETUP NOTES ===============
Use NRPE.
Copy this file to a submit or admin host of your installation.
Adapt the installation paths at the top of the script, directly below its header
comment. You will need to set the base path of your SGE installation, the name
of your SGE cell and the path to the common environment settings file of your
installation.
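As an illustration of the three settings mentioned above, the following sketch shows what such a path block might look like. The values `/opt/sge` and `default` are placeholders, and the variable name `SGE_SETTINGS` and the function `load_sge_environment` are hypothetical; the actual names used in check_griderrors.sh may differ. The sketch assumes the common layout in which the settings file lives at `$SGE_ROOT/$SGE_CELL/common/settings.sh`.

```shell
#!/bin/bash
# Hypothetical values; replace them with the paths of your own installation.
SGE_ROOT="/opt/sge"
SGE_CELL="default"
SGE_SETTINGS="$SGE_ROOT/$SGE_CELL/common/settings.sh"

# Source the SGE environment settings, or report UNKNOWN (code 3) if the
# file cannot be read, for example because the path was not adapted.
load_sge_environment() {
    local settings=$1
    if [ ! -r "$settings" ]; then
        echo "UNKNOWN - cannot read SGE settings file: $settings"
        return 3
    fi
    . "$settings"
}

load_sge_environment "$SGE_SETTINGS"
```

Checking that the file is readable before sourcing it gives a clear UNKNOWN result in Nagios instead of a confusing failure later, when the SGE commands are not found.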
If you have overlapping queues (i.e. some nodes belong to more than one queue),
you will have to ignore some of them; otherwise the cores of the shared nodes
are counted more than once and the total sums of cores will be wrong.
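The effect of ignoring overlapping queues can be sketched as follows. This is only an illustration under stated assumptions: the function `sum_cores`, the queue names, and the simplified input format (one line per queue with name, used cores and total cores) are hypothetical and do not reflect how the plugin itself reads the queue data.

```shell
#!/bin/bash
# Hypothetical helper: sum used and total cores over all queues except
# the ignored ones.
# Argument: space-separated list of queue names to skip.
# Input on stdin: one line per queue, "<queue_name> <used_cores> <total_cores>".
sum_cores() {
    local ignore=$1
    awk -v ignore="$ignore" '
        BEGIN {
            # Build a lookup table of the queue names to skip.
            n = split(ignore, names, " ")
            for (i = 1; i <= n; i++) skip[names[i]] = 1
        }
        !($1 in skip) { used += $2; total += $3 }
        END { printf "used=%d total=%d\n", used, total }
    '
}

# short.q shares its 64 cores with all.q, so it is ignored.
printf '%s\n' "all.q 40 64" "short.q 8 64" "gpu.q 4 16" | sum_cores "short.q"
# prints: used=44 total=80
```

Without the ignore list, the same input would report 144 total cores, although the nodes only provide 80.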
To test, run the following on a submit host:
./check_griderrors.sh -w 1 -c 2
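For readers unfamiliar with how options such as `-w` and `-c` are usually handled in a shell plugin, here is a minimal sketch using the bash builtin `getopts`. The function name `parse_thresholds`, the variables `WARN` and `CRIT`, and the usage message are hypothetical; the actual script may parse its arguments differently.

```shell
#!/bin/bash
# Hypothetical helper: read the -w (warning) and -c (critical) thresholds
# into the global variables WARN and CRIT.
parse_thresholds() {
    local OPTIND=1 opt
    WARN=""
    CRIT=""
    while getopts "w:c:" opt; do
        case "$opt" in
            w) WARN=$OPTARG ;;
            c) CRIT=$OPTARG ;;
            *) return 3 ;;   # unknown option: UNKNOWN in Nagios terms
        esac
    done
    # Both thresholds are required in this sketch.
    if [ -z "$WARN" ] || [ -z "$CRIT" ]; then
        echo "UNKNOWN - usage: check_griderrors.sh -w <warn> -c <crit>"
        return 3
    fi
}

parse_thresholds -w 1 -c 2 && echo "warn=$WARN crit=$CRIT"
# prints: warn=1 crit=2
```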
============= SETUP EXAMPLES =============
define command{
    command_name    check_griderrors
    command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_griderrors
}
nrpe.cfg:
command[check_griderrors]=/usr/lib64/nagios/plugins/check_griderrors.sh -w 1 -c 2
define service{
    use                     generic-service
    host_name               submithost01
    service_description     Check Griderrors
    check_command           check_griderrors
    normal_check_interval   3
    retry_check_interval    1
}