README for NAGIOSTAT release 1.1

I released this file alone since I got a lot of questions that may be answered by reading this...
//Carl

**PREFACE
Nagiostat, Copyright Carl Bingel 2004.
The whole software package is released under the GNU General Public License. See included GPL.txt for license-details.

Author can be reached at: bingel@users.sourceforge.net
For support, see the forum and mailing list on sourceforge (http://www.sf.net/projects/nagiostat).

Nagios and the Nagios logo are registered trademarks of Ethan Galstad.

**WHAT NAGIOSTAT DOES
NagioStat is a small "glue"-program that sits between Nagios and rrdtool.

Nagios is a network-monitor application that can supervise network attached devices (computers, switches, routers and so on) and poll them for different kinds of data. Based on trigger-levels it issues alarms when a value polled from a device is out of bounds. However, Nagios has no built-in support for storing the polled performance-values for trending and graphing. See http://www.nagios.org

rrdtool is a neat package for storing values in a round-robin-database and generating graphs from them. For more information, see http://people.ee.ethz.ch/~oetiker/webtools/rrdtool/

**INSTALLING
1. Unpack the tar-file to a directory, for example '/usr/local/nagios/nagiostat'. The nagiostat-base-dir does not have to be within the nagios-installation directory.

2. Start by editing the $BASE_DIR parameter in the file called "nagiostat", which is the main script. Set the value to the directory where you unpacked the files.

Create a file called "debug.log" ('$ touch debug.log') in this dir and change its permissions so that the nagios-user may write to it. This is the logfile for nagiostat, which can be very helpful when debugging the regular expressions.

If things are installed in the directory specified in step 1, you can also run "make install" to create debug.log and the archives-directory and to set permissions on them. "make install" is under development.
It also assumes that the nagios-user is called 'nagios'.

3. Then read the nagiostat.conf-file, which is also where most of the documentation is located, and change the parameters to your liking.

4. Then set up your webserver so that you can access nagiostat.cgi (a symbolic link to 'nagiostat') from some URL. Example for apache:
---
Alias /nagiostat/ /usr/local/nagios/nagiostat/
<Directory /usr/local/nagios/nagiostat/>
    Options +ExecCGI
    Order allow,deny
    Allow from all
</Directory>
---
You also have to make sure that the apache-user has read-rights to the RRDArchivePath and to the config-file nagiostat.conf. You will probably also want to add additional security to this, password protection for instance.

5. Set up nagios to enable performance-data handling and to send the perfdata to nagiostat.

In nagios.cfg, set the parameter 'process_performance_data=1'. This enables processing of performance data.

Then add the parameter 'service_perfdata_command=service-perf-data-handler', also to nagios.cfg. This tells nagios to run the "service-perf-data-handler" command to process the performance data after each plugin has been executed.

Then add the following to the checkcommands.cfg-file (or whatever nagios-config-file you find suitable):
---
##
## PERF-DATA-HANDLER
##
define command {
    command_name    service-perf-data-handler
    command_line    /usr/local/nagios/nagiostat/nagiostat -p "$LASTCHECK$|!!|$HOSTNAME$|!!|$SERVICEDESC$|!!|$SERVICESTATE$|!!|$OUTPUT$|!!|$PERFDATA$"
}
---
Alter the path to nagiostat to reflect where you installed the files in step 1.

6. To add a nifty icon to click on in the "service detail"-page in the nagios web interface, you can add something like the following to the serviceextinfo.cfg-file:
---
define serviceextinfo {
    host_name              yer-host-name
    service_description    PING
    notes_url              /nagiostat/nagiostat.cgi?graph_name=yer-host-name
    icon_image             graph.gif
    icon_image_alt         View graphs
}
---
NOTE: If you don't have a serviceextinfo.cfg, you have to create one and then add the following line to cgi.cfg:
"xedtemplate_config_file=/usr/local/nagios/etc/serviceextinfo.cfg"
See nagios documentation for further details on this subject.

Copy graph.gif to /usr/local/nagios/share/images/logos (or the corresponding directory of your nagios-installation) to get the little ugly graph-icon in the right place.

7. DONE!

*******************************************
** FAQ, tips & troubleshooting
*******************************************

1. Check web server error-output
If you're having problems with graphs not showing (i.e. the browser shows a broken image instead of a graph), it is probably due to problems executing the 'rrdtool graph'-command. See the error logfile from your webserver for further info on the problem (this is usually found in a file called something like '/var/log/httpd/error_log').

2. Problem with rrdtool commandline parameters
Another problem encountered is that the example nagiostat.conf-file included in the distribution calls rrdtool graph with part of the commandline looking like this: 'GPRINT:rta:MAX:\"Roundtrip MAX\\: %.4lgms\"'. The '%.4lg' generates an error in older versions of rrdtool, since the formatter 'lg' was only added quite recently. The solution is to either update your rrdtool version or replace 'lg' with 'lf'.

3. Reading and understanding debug.log
Apart from the error_log-output of your web server, always look in the debug.log-file of nagiostat (set DEBUGLEVEL=3 for maximum logging).
A sample of a successful data insertion should look something like this:
---
Sat Jul 17 12:57:07 2004
**INCOMING PERFDATA:
LASTCHECK=1090061817
HOSTNAME=itknsgw1
SERVICEDESCR="if-traffic"
SERVICESTATE="OK"
OUTPUT="OK: rate[IN]=230 kbit/s OK: rate[OUT]=50 kbit/s"
PERFDATA=""
+VALUE: 230
+VALUE: 50
=INSERT into 'itknsgw1-ifrate.rrd': 230,50 DSA-names=in,out
!RRDCMDLINE: /usr/local/bin/rrdtool update /usr/local/nagios/nagiostat/archives/itknsgw1-ifrate.rrd --template in:out 1090061817:230:50
---
The first 6 lines (LASTCHECK, HOSTNAME, SERVICEDESCR, SERVICESTATE, OUTPUT, PERFDATA) are the parameters that get sent to nagiostat from nagios.

HOSTNAME and SERVICEDESCR are the fields that the Hostregex and Serviceregex in the InsertValue-statements (in nagiostat.conf) are matched against.

OUTPUT is the textual output (the same as seen in the "Status Information"-column on the "Service Detail"-page of the nagios web interface). PERFDATA is the performance data returned by the nagios check plugin (e.g. check_ping or check_snmp). Not many plugins return any PERFDATA yet, which is why you often have to extract the data from the OUTPUT-field instead. OUTPUT and PERFDATA are the data that nagiostat applies the ValueRegexTemplates to in order to extract the values to insert into the rrd-archive.

The '+VALUE:'-lines show that the ValueRegexTemplate succeeded in extracting the data from the OUTPUT or PERFDATA-fields.

The '=INSERT into'-line shows how nagiostat plans to insert the data: in which rrd-archive and in which DSA.

The '!RRDCMDLINE:'-line shows the exact command-line with which rrdtool is called. Unfortunately, any error from this call is not logged in debug.log yet, so if you suspect anything fishy is going on there, simply copy the commandline from debug.log, paste it into a shell and press enter. Then you should see the exact error (if any) that rrdtool complains about.

4. When are the rrd-archives created?
They are created on the fly when nagiostat notices that no file with the specified filename exists. It is therefore important that the nagios-user has write-permissions to the RRDArchivePath.
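
To make the debug.log walk-through in FAQ item 3 a bit more concrete, here is a small Python sketch that mimics the flow shown in the sample log: split the '-p' argument into its six fields, pull the values out of OUTPUT, and build the value part of the 'rrdtool update' call. This is only an illustration and not how nagiostat itself is implemented. The splitting on '|!!|' is inferred from the command_line in step 5, and the function names and the regular expression are made up for this example (the real patterns are the ValueRegexTemplates in nagiostat.conf).

```python
import re

# Separator and field order as they appear in the command_line of step 5.
FIELD_SEPARATOR = "|!!|"
FIELD_NAMES = ["LASTCHECK", "HOSTNAME", "SERVICEDESCR",
               "SERVICESTATE", "OUTPUT", "PERFDATA"]

# Hypothetical pattern for this example only; it is NOT taken from
# nagiostat.conf. It captures the number after rate[IN]= or rate[OUT]=.
RATE_PATTERN = re.compile(r"rate\[(?:IN|OUT)\]=(\d+)")


def split_perfdata_argument(argument):
    """Split the string given after -p into a dict of the six fields."""
    parts = argument.split(FIELD_SEPARATOR)
    if len(parts) != len(FIELD_NAMES):
        raise ValueError("expected %d fields, got %d"
                         % (len(FIELD_NAMES), len(parts)))
    return dict(zip(FIELD_NAMES, parts))


def extract_values(text):
    """Return all numbers matched by RATE_PATTERN, in order of appearance."""
    return [int(number) for number in RATE_PATTERN.findall(text)]


# The same data as in the sample debug.log (PERFDATA is empty there).
argument = ("1090061817|!!|itknsgw1|!!|if-traffic|!!|OK|!!|"
            "OK: rate[IN]=230 kbit/s OK: rate[OUT]=50 kbit/s|!!|")

fields = split_perfdata_argument(argument)
values = extract_values(fields["OUTPUT"])

# Timestamp and values joined with ':' as in the !RRDCMDLINE line.
update_argument = ":".join([fields["LASTCHECK"]] + [str(v) for v in values])
print(update_argument)  # 1090061817:230:50
```

If a pattern like this matches nothing, 'values' comes back empty, which would correspond to a debug.log entry without any '+VALUE:'-lines. That is usually the first thing to check when no data ends up in the rrd-archive.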