Plugin, template and nagvis gadget for checking system stats via SNMP
Version 1.0

If you find any of these useful, drop me an email and/or visit my
employer's website to see some of the cool stuff we make. :-)
Brent Bice
bbice@sgi.com
http://www.sgi.com/

This plugin fetches, graphs, and visualizes (and can optionally warn
about) various system performance stats. It works best with systems
that are running net-snmp agents, but it will also do some minimal
monitoring of systems that support the mib2 host MIB (such as modern
versions of Windows).  You can use it with nagios to send notifications
about obvious stuff like idle CPU dropping below a certain level (not
especially useful to me - many of our servers run flat out for days at a
time). But you can also use it to raise alarms about specific sorts of CPU
usage, such as System or Nice CPU (e.g., when you trip over an Oracle bug
that causes System CPU to suddenly spike way above the norm of 10-20
percent and you have mere minutes to log in and diagnose the problem
before the system becomes unusable - grin).

This tarball contains several files:
sgichk_snmp_system.pl - The plugin itself.  It requires memcached to be
running. The plugin uses memcached to store the previous CPU counters and
the date/time it last ran, so it knows the time delta between checks.
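
To illustrate why the previous counters and timestamp need to be kept
between runs, here is a minimal Python sketch of the delta calculation.
It is not the plugin's code (the plugin is Perl and uses memcached); the
function name cpu_percentages, the dict used as a stand-in cache, and the
counter names are all assumptions made for the example.

```python
import time

# Stand-in for memcached: a plain dict keyed by host name. This is an
# assumption for illustration; the real plugin keeps this state in memcached.
cache = {}


def cpu_percentages(host, counters, now=None, store=cache):
    """Return (elapsed_seconds, percentages) since the previous check.

    counters maps a category name (e.g. "user", "system", "idle") to a raw,
    ever-increasing tick counter. Returns None on the first run for a host,
    or when the counters appear to have been reset (e.g. after a reboot).
    """
    now = time.time() if now is None else now
    previous = store.get(host)
    # Always remember the current sample for the next run.
    store[host] = {"time": now, "counters": dict(counters)}
    if previous is None:
        return None  # nothing to compare against yet

    deltas = {}
    for name, value in counters.items():
        delta = value - previous["counters"].get(name, value)
        if delta < 0:
            return None  # counter went backwards: treat as a reset
        deltas[name] = delta

    total = sum(deltas.values())
    if total == 0:
        return None  # no ticks elapsed between the two samples

    elapsed = now - previous["time"]
    return elapsed, {name: 100.0 * d / total for name, d in deltas.items()}
```

The first call for a host only primes the cache; percentages are available
from the second call onward, which is why a brand new service may show no
CPU data on its first check.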

check_snmp_system.php - A pnp4nagios template for generating graphs of
CPU, Memory, Swap, Paging Activity, Swap Activity, and load averages.

snmp-sysperf.php - A nagvis gadget for visualizing the CPU, Memory and
load average data on a nagvis dashboard.  It displays CPU and memory usage
as bar charts similar to the old xosview program.  The load averages are
printed as text.  You must customize this script to tell it where to find
a TrueType font file of your choice.  Just change the $font= setting near
the top of the file to the absolute path of a TrueType font.

Here are some example nagios config entries:

# Check system perf stats via SNMP
define command{
        command_name    check_snmp_system
        command_line    $USER1$/sgichk_snmp_system.pl -H $HOSTADDRESS$ -C $USER4$ $ARG1$
        }


# An example of using the command.
# We always use SNMP v2 or v3 because bulk queries are much faster, so in
# the example below I'm using -2 to specify v2.  In this example, I don't
# want any notifications about load, cpu, paging, or swapping. I'm just
# using nagios to fetch/graph/visualize these stats. See the next example
# for a service that does set thresholds.
#
# Check system perf stats
define service{
        host_name               test-dbserver
        use                     pnp4nag-service,local-service
        normal_check_interval   1
        service_description     Check System Performance
        check_command           check_snmp_system!-2
        contact_groups          dcounix-email
}

# In this example, a Barracuda BMA, I want to know if the Nice CPU usage
# jumps or the load average goes too high.  Either is a (happily not too
# frequent) indication that something is amiss on our BMA.
#
# Check system perf stats
define service{
        host_name               bma
        use                     pnp4nag-service,local-service
        normal_check_interval   1
        max_check_attempts      15
        service_description     Check System Performance
        check_command           check_snmp_system!-2 --load-warn=15 --load-crit=30 --nice-warn=15 --nice-crit=20
}
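
For readers new to warn/crit pairs like the ones above, the following
Python sketch shows one plausible way such thresholds map onto the
standard nagios plugin states (0 = OK, 1 = WARNING, 2 = CRITICAL). The
function check_thresholds and the "greater than or equal" comparison are
assumptions for illustration only; check the plugin itself for its exact
rules.

```python
# Standard nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def check_thresholds(values, thresholds):
    """Return (status, messages) for measured values against thresholds.

    values maps a stat name to its measured value, e.g. {"load": 17.2}.
    thresholds maps a stat name to a (warn, crit) pair, e.g.
    {"load": (15, 30)}. Higher values are assumed to be worse.
    """
    status = OK
    messages = []
    for name, (warn, crit) in thresholds.items():
        value = values.get(name)
        if value is None:
            continue  # stat not collected: nothing to compare
        if value >= crit:
            status = max(status, CRITICAL)
            messages.append(f"{name}={value} >= crit {crit}")
        elif value >= warn:
            status = max(status, WARNING)
            messages.append(f"{name}={value} >= warn {warn}")
    return status, messages


if __name__ == "__main__":
    # Thresholds mirroring the bma service definition above.
    limits = {"load": (15, 30), "nice": (15, 20)}
    print(check_thresholds({"load": 17.2, "nice": 3.0}, limits))
```

With max_check_attempts set to 15 and a one minute check interval, a
state like this has to persist for several checks before nagios treats it
as a hard problem and notifies.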

# Last example - a Windows system that supports the host MIB but not all
# the cool features of net-snmp.
#
# Check system perf stats
define service {
        host_name               pv-excas1-dc21
        use                     pnp4nag-service,local-service
        normal_check_interval   1
        service_description     Check System Performance
        check_command           check_snmp_system!-2 --use-mib2
}