Grid Computing

check_griderrors

Description:

Check errors (and performance) of queues in SGE installations

Current Version

0.1

Last Release Date

2014-03-14

Compatible With

  • Nagios 3.x

License

MIT


Project Notes
This plugin checks the status of all queues of an SGE installation, warns when a defined number of queues have errors, and logs performance data. The host where the script is executed needs to be a submit or an admin host of the SGE installation, so running it through NRPE is recommended.

============ SETUP NOTES ===============

Use NRPE. Copy this file to a submit or admin host of your installation.

Adapt the installation paths near the top of the script, directly below its header comment. You will need to set the base path of your SGE installation, the name of your SGE cell, and the path to the common environment settings file of your installation.

If you have overlapping queues (i.e. some nodes belong to more than one queue), you will have to ignore some of them; otherwise the total core counts will be wrong.

Example to test, run on a submit host:

    ./check_griderrors.sh -w 1 -c 2

============= SETUP EXAMPLES =============

Command definition on the Nagios server (the command_name must match the check_command used in the service definition):

    define command{
        command_name    check_griderrors
        command_line    $USER1$/check_nrpe -H $HOSTADDRESS$ -c check_griderrors
    }

nrpe.cfg:

    command[check_griderrors]=/usr/lib64/nagios/plugins/check_griderrors.sh -w 1 -c 2

Service definition:

    define service{
        use                     generic-service
        host_name               submithost01
        service_description     Check Griderrors
        check_command           check_griderrors
        normal_check_interval   3
        retry_check_interval    1
    }
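The way the -w and -c thresholds turn into a Nagios result can be sketched as follows. This is not the code of check_griderrors.sh: the function name griderrors_status, the output wording, and the perfdata label error_queues are hypothetical, and the sketch assumes a threshold is reached when the number of queues in error state is greater than or equal to it. It relies only on the standard Nagios plugin conventions (exit code 0 for OK, 1 for WARNING, 2 for CRITICAL, and performance data after a "|" in the form label=value;warn;crit).

```shell
#!/bin/sh
# Hypothetical helper, NOT taken from check_griderrors.sh.
# Usage: griderrors_status ERROR_QUEUES WARN CRIT
# Assumption: a threshold triggers when ERROR_QUEUES >= threshold.
griderrors_status() {
    errors=$1
    warn=$2
    crit=$3

    # Check the critical threshold first so it takes precedence.
    if [ "$errors" -ge "$crit" ]; then
        state="CRITICAL"; code=2
    elif [ "$errors" -ge "$warn" ]; then
        state="WARNING"; code=1
    else
        state="OK"; code=0
    fi

    # Text before "|" is the status message; text after it is
    # performance data in the form label=value;warn;crit.
    echo "GRIDERRORS $state - $errors queue(s) in error state | error_queues=$errors;$warn;$crit"
    return "$code"
}
```

With the thresholds from the example above (-w 1 -c 2), calling griderrors_status 1 1 2 would print a WARNING line and return 1, while griderrors_status 2 1 2 would return 2. How the real script counts the queues in error state is not shown here.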