' check_smb_speed
'
'
' This is a Nagios script that checks read or write speed over SMB.
'
'
'
' This script doesn't take into account the clearing of all the caches where previously written data can reside.
' Think about how appropriate this script is in your situation and how large your cache is.
'
' A useful example is:
'   I want to perform two operations:
'   - write once every 5 minutes to a file server and get the time (starting at time 0 minutes)
'   - read once every 5 minutes from a file server and get the time (starting at time 2.5 minutes)
'   Because I know that the I/O on this server is large, it is likely that all of the caches** are cleared by 2.5 minutes.
'   ** caches could include: Client SMB, client NIC buffer, server NIC buffer, Server SMB, Server hdd driver cache, Server RAID controller write cache, Server RAID HDDs individual cache
'
' These are just considerations and are probably not applicable to your situation.
' This is probably a 100% acceptable test unless you're in an extremely high frequency area.
'
' Writes and reads are in 64KB chunks because that's about the size of the SMB 1.0 flush size ("block size"). Search for FileChunk to adjust this.
' Deleting files is never counted towards either total operation time.
' performance notes:
' any fso method is slow.
' Hence, the fso.fileexists call that checked whether a file exists before it is deleted has been removed.
' Deletion is forced, and the files are always deleted and re-written during the READ or WRITE operations.
' If the files don't exist, fso.deletefile will cause an error, so On Error Resume Next is set in the local DeleteFiles() function.
'
' These deletions are never counted towards the final timer of the operation (READ or WRITE).
' See the end of this script for a dialogue about cache utilization, and scheduling checks.
'
' security concerns:
' You must run the NSClient++ process under a service account that has access to the target SMB share, so that it can execute the script.
' I suggest you make this a specific account on the computer where NSClient++ is running and the target computer, or a least-privilege account on your domain.
' The user need only have read/write/delete access to the SMB share, and SeServiceLogonRight on the NSClient++ host computer.
' Note that if you do this right, the user won't be able to write a log file. I suggest you touch then chown the log file to the service user.
'
' Make sure you will not suffer from command injection by utilizing allowed_hosts and only allowing certain hosts through the Windows firewall (and, if you want to, implement a HIDS).
' This script *CAN* actually cause a denial of service by utilizing system resources.
' When you start an SMB session, a part of memory from a memory pool called the non-paged pool is allocated to this task.
' Multiply this task by any number and the non-paged pool starts getting eaten up. This can cause your system to not be able to allocate resources to new tasks. (see Event ID 2000, 2019, 2020 from source srv; http://go.microsoft.com/fwlink/?LinkId=83250 /?LinkId=83251 /?LinkId=83252)
'
' Solution: use allowed_hosts, use a firewall, use a HIDS (like Tripwire ($), OSSEC, samhain), use something to monitor your memory pools, like nagios and nsclient++ performance counter checks.
'
'
' TODO: have script work with wrapper.vbs in NSClient++, allowing it to be wrapped.
'
' :mbrownnyc on freenode
'
'
'
'Here's an example line to run on the NSClient++ host:
'cscript.exe //T:30 //NoLogo check_smb_speed.vbs /H:192.168.20.25 /P:\test$ /READ /S:64 /N:10 /w:500 /c:700
'
'NSClient++ nsc.ini Configuration considerations/instructions:
'
'In nsc.ini on the NSClient++ host:
' [modules]
' NRPEListener.dll
' CheckExternalScripts.dll
' [Settings]
' use_file=1
' allowed_hosts=[ip of your nagios poller]
' [NRPE]
' allow_arguments=1
' allow_nasty_meta_chars=1 ;READ UP ON THIS! Other than the risk of DOS explained above, this script should not suffer from injection attacks.
' use_ssl=1
' allowed_hosts=[ip of your nagios poller]
' [External Scripts]
' allow_arguments=1
' allow_nasty_meta_chars=1 ;READ UP ON THIS! Other than the risk of DOS explained above, this script should not suffer from injection attacks.
' check_smb_speed=cscript.exe //T:30 //NoLogo scripts\check_smb_speed.vbs /H:$ARG1$ /P:$ARG2$ $ARG3$ /S:$ARG4$ /N:$ARG5$ /w:$ARG6$ /c:$ARG7$
'
'Where:
'$ARG1$ = hostname of the target of the /READ or /WRITE (the source of the /READ or /WRITE request will be the NSClient++ host, of course)
'$ARG2$ = path to the remote share or folder (for example \test$)
'$ARG3$ = /READ or /WRITE
'$ARG4$ = size of file in KB (only used during /WRITE, script doesn't error if provided during /READ)
'$ARG5$ = number of times to read or write the file
'$ARG6$ = warning level (the script returns ms, so this is the length of time the /READ or /WRITE takes that is considered WARNING)
'$ARG7$ = critical level (the script returns ms, so this is the length of time the /READ or /WRITE takes that is considered CRITICAL)
'
'Here's an example test check_nrpe command
' you must increase the timeout with -t, as most full script executions will take longer than the 10 second default socket timeout... You should be able to tighten this if you wish by reviewing the command latency measured by nagios.
'/usr/local/nagios/libexec/check_nrpe -t 30 -H 192.168.20.25 -c check_smb_speed -a 192.168.20.25 '\test$' /READ 64 10 500 700
'
'
'Here is a command definition for nagios:
' cat nagios.cfg | grep cfg_dir will return the location of your checkcommands.cfg
'
'define command{
'    command_name    check_smb_speed
'    command_line    $USER1$/check_nrpe -t 30 -H $HOSTADDRESS$ -c check_smb_speed -a $ARG1$ $ARG2$ $ARG3$ $ARG4$ $ARG5$ $ARG6$ $ARG7$
'}
'
'
'Here is a service definition for nagios:
' cat nagios.cfg | grep cfg_dir will return the location of your services.cfg
'
'define service{
'    host_name               win2k3-testb
'    service_description     SMB Read Speed to 192.168.20.25
'    _SERVICE_ID             130
'    use                     generic-service
'    check_command           check_smb_speed!192.168.20.25!\\\\test_target_share$!/READ!64!10!500!700
'    check_period            24x7
'    notifications_enabled   1
'    contacts                Matt
'}
' \\\\ will pass a '\', if it is not nested in quotes. It is necessary to pass this, due to my stubbornness of not wanting to put the backslash into the script.
'And if you're not too cool for school and use Centreon, when you create the service, you can assign it the graph template "latency"
'And finally... our script:

' DEBUGging is defined: replace 'wscript.echo "DEBUG: " with wscript.echo "DEBUG: "
'On Error Resume Next
set Args = wscript.arguments.named
set fso=createobject("scripting.filesystemobject")

dim host, path, size, iterations
dim OperationType, TotalTime, NumberOfFilesToWriteOrRead
dim WARNING_level, CRITICAL_level

Main()
wscript.quit

function Main()
    if args.count = 0 then
        IfYouGetCaught "help", "args.count = 0"
    end if
    if args.exists("help") then 'help
        IfYouGetCaught "help", "args.exists(""help"")"
    end if
    if args.exists("w") then 'WARNING_level
        WARNING_level = (cdbl(args.item("w")))
        'wscript.echo "DEBUG: " & "WARNING_level: " & WARNING_level & "ms"
    else
        IfYouGetCaught "help", "args.exists(""w"") doesn't exist"
    end if
    if args.exists("c") then 'CRITICAL_level
        CRITICAL_level = (cdbl(args.item("c")))
        'wscript.echo "DEBUG: " & "CRITICAL_level: " & CRITICAL_level & "ms"
    else
        IfYouGetCaught "help", "args.exists(""c"") doesn't exist"
    end if
    if args.exists("H") then 'host
        host = args.item("H")
        hostisup(host)
    else ' if no H
        IfYouGetCaught "help", "args.exists(""H"") doesn't exist"
    end if
    if args.exists("P") then 'path
        path = args.item("P")
        'don't use localhost as the name, NetBIOS uses
        if fso.folderexists("\\" & host & path) = true then
            'wscript.echo "DEBUG: " & "Setting folder " & "\\" & host & path
            'TargetFolder = fso.getfolder("\\" & host & path)
            ' there is no need to get a folder object specifically, just set a string
            TargetFolder = "\\" & host & path
            'wscript.echo "DEBUG: " & "DONE: Setting folder " & "\\" & host & path
        else
            IfYouGetCaught "nofolder", "no folder"
        end if
    else '...
        IfYouGetCaught "help", "args.exists(""P"") doesn't exist"
    end if
    if args.exists("N") then 'NumberOfFilesToWriteOrRead
        ' try to convert N to a number
        NumberOfFilesToWriteOrRead = cdbl(args.item("N"))
        ' error if error
        if err.number <> 0 then 'there has been an error
            'wscript.echo "Error: " & Err.Number
            'wscript.echo "Source: " & Err.Source
            'wscript.echo "Description: " & Err.Description
            err.clear
            wscript.quit
        end if
    else '...
        IfYouGetCaught "help", "args.exists(""N"") doesn't exist"
    end if
    if args.exists("WRITE") then 'WRITE
        OperationType = "WRITE"
        if args.exists("S") then 'size
            size = cdbl(args.item("S"))
            'wscript.echo "DEBUG: " & "size to write in KB: " & size
            'number is delivered in KB: 1024 for example
            size = size * 1024 'convert size to bytes by multiplying by 1024
            'wscript.echo "DEBUG: " & "size to write in bytes: " & size
            'wscript.echo "DEBUG: " & "files to be written: " & NumberOfFilesToWriteOrRead
            'you must delete all the files first (deleting seems to take a while, so for sake of clocking the write times, deletion is separated)
            for i = 1 to NumberOfFilesToWriteOrRead
                DeleteFiles TargetFolder & "\", size, i
            next
            'the main part... writing the file(s)
            for i = 1 to NumberOfFilesToWriteOrRead
                'wscript.echo "DEBUG: " & "TargetFolder: " & TargetFolder
                WriteFile TargetFolder & "\", size, i
            next
        else '...
            IfYouGetCaught "help", "args.exists(""S"") doesn't exist"
        end if
    elseif args.exists("READ") then 'READ
        OperationType = "READ"
        if args.exists("S") then 'size is irrelevant during reads, always reading in 64KB chunks (SMB 1.0), and always reading _smbtestfilei until N
            'do nothing
        end if
        '(Re)create the file(s) to read; the existence check has been removed (see the performance notes at the top)
        for i = 1 to NumberOfFilesToWriteOrRead
            'wscript.echo "DEBUG: " & "TargetFolder & \ & _smbtestfile & i: " & TargetFolder & "\" & "_smbtestfile" & i
            'wscript.echo "DEBUG: " & "file exists? " & fso.fileexists(TargetFolder & "\" & "_smbtestfile" & i)
            'if fso.fileexists(TargetFolder & "\" & "_smbtestfile" & i) = false then
                'if not write the things
                'READING is not a declared variable, so WriteFile receives Empty and writes a single 64KB chunk
                WriteFile TargetFolder & "\", READING, i
            'end if
        next
        'creating the test files must not count toward the read time, so reset the running total
        TotalTime = 0
        'Read the files
        for i = 1 to NumberOfFilesToWriteOrRead
            ReadFile TargetFolder & "\", i
        next
    else
        IfYouGetCaught "help", "neither args.exists(""READ"") or args.exists(""WRITE"")"
    end if
    ReturnCheckInfo()
end function

function IfYouGetCaught(ErrStr, source)
    if ErrStr = "help" then
        wscript.echo "UNKNOWN | "
        wscript.echo "Error Source: " & source
        wscript.echo ""
        wscript.echo " cscript check_smb_speed.vbs /H:[host address or DNS name] /P:[path to the share or folder on the host] [/READ||/WRITE] [/S:[size of file in KB]] /N:[count of file(s)] /w:[warning level in milliseconds] /c:[critical level in milliseconds]"
        wscript.echo ""
        wscript.echo ""
        wscript.echo "!!! Writes are made in 64KB chunks."
        wscript.echo ""
        wscript.echo "example:"
        wscript.echo " cscript check_smb_speed.vbs /H:192.168.0.110 /P:\smbtestshare$ /WRITE /S:1024 /N:10 /w:500 /c:1000"
        wscript.echo " This writes ten 1024KB files named _smbtestfilei (where i is the increment of N) to the folder \\192.168.0.110\smbtestshare$\"
        wscript.echo ""
        wscript.echo "example:"
        wscript.echo " cscript check_smb_speed.vbs /H:localhost /P:\c$ /READ /N:1 /w:300 /c:700"
        wscript.echo " This reads once the file _smbtestfile1 in the folder \\localhost\c$\."
        wscript.echo " The file is (re)created before it is read."
        wscript.echo " This creation of the file is not counted toward the total read time (value returned by the script)."
        wscript.quit 3
    end if
    if ErrStr = "nofolder" then
        'wscript.echo "DEBUG: " & ""
        'wscript.echo "DEBUG: " & "There ain't no foldah here: " & "\\" & host & path
        'give nagios a status line instead of quitting silently
        wscript.echo "UNKNOWN - folder not found: \\" & host & path
        wscript.quit 3
    end if
end function

function hostisup(host) 'ping host
    set PingResults = GetObject("winmgmts://./root/cimv2").ExecQuery("SELECT * FROM Win32_PingStatus WHERE Address = '" & host & "'")
    for each result in PingResults
        if IsNull(result.StatusCode) or result.StatusCode <> 0 then
            hostisup = false
        else
            hostisup = true
        end if
    next
end function

function folderexists(path) 'check if directory exists
    if fso.folderexists(path) = true then
        folderexists = true
    else ' if not, fail
        folderexists = false
    end if
end function

function DeleteFiles(folder, size, iteration)
    'wscript.echo "DEBUG: " & "starting to delete files"
    'size is unused here; the parameter is kept so existing calls keep working.
    'the file only needs to be deleted once, whatever its size.
    deletefilepath = folder & "_smbtestfile" & iteration
    'wscript.echo "DEBUG: " & "folder & _smbtestfile & iteration: " & deletefilepath
    'fso.fileexists doesn't like "\\" in the path, other than the beginning.
    'if instr(2, deletefilepath, "\\") = instrrev(deletefilepath, "\\") then ' there is a double backslash in the path
    '    deletefilepath = left(deletefilepath, instr(2, deletefilepath, "\\")) & right(deletefilepath, len(deletefilepath) - instrrev(deletefilepath, "\\") - 1)
    'end if
    'wscript.echo "DEBUG: " & "fso.fileexists(deletefilename): " & fso.fileexists(deletefilepath)
    'deleting a file that does not exist raises an error, so ignore errors inside this function
    On Error Resume Next
    'if fso.fileexists(deletefilepath) = true then
        'wscript.echo "DEBUG: " & " DELETING: " & deletefilepath
        fso.deletefile deletefilepath, true
        'wscript.echo "DEBUG: " & " DELETED: " & deletefilepath
    'end if
end function

function WriteFile(folder, size, iteration)
    'wscript.echo "DEBUG: " & "write file starts..."
    if size <> "READING" then
        'size has been delivered in bytes
        StopWritingAtThis = size/65536 '65536 is the number of bytes in 64KB... if the total size is 1048576 bytes, 1048576/65536 = 16
        if StopWritingAtThis < 1 then
            StopWritingAtThis = 1
        end if
        'wscript.echo "DEBUG: " & " StopWritingAtThis: " & StopWritingAtThis
    end if
    'declare a byte
    ASCIIByte = chr(000) 'null, cause that's how we roll
    'build the chunk that you will write, of 64KB
    'this is the maximum size of an SMB 1.0 block, 64KB
    'string() builds the whole chunk in one call instead of 65536 concatenations
    FileChunk = string(65536, ASCIIByte)
    'wscript.echo "DEBUG: " & "The file chunk size is: " & len(Filechunk) & " bytes"
    if size <> "READING" then
        '''''start timer to count against write timer
        'this will be cumulatively built for each writing execution.
        'this stops the deletions from affecting the time
        StartTime = timer
    end if
    'wscript.echo "DEBUG: " & " Creating text file " & folder & "_smbtestfile" & iteration
    'create a text file encoded in ASCII (this matters because each ASCII character is written as one byte)
    set TextFileToWrite = fso.createtextfile(folder & "_smbtestfile" & iteration, true, false)
    'wscript.echo "DEBUG: " & " DONE: Creating text file"
    'try to write to the folder
    i = 0
    for i = 1 to StopWritingAtThis 'write the 64KB blocks, 16 times in our example.
        TextFileToWrite.write(FileChunk)
    next
    if size <> "READING" then
        '''''stop timer to count against write timer
        EndTime = timer
        TotalTime = TotalTime + (EndTime-StartTime)
        'wscript.echo "DEBUG: " & "Total time to write file " & "_smbtestfile" & iteration & ": " & EndTime & "-" & StartTime & "=" & EndTime-StartTime
    end if
    'catch errors
    'if err.number <> 0 then 'there has been an error
    '    'wscript.echo "DEBUG: " & "Error: " & Err.Number
    '    'wscript.echo "DEBUG: " & "Source: " & Err.Source
    '    'wscript.echo "DEBUG: " & "Description: " & Err.Description
    '    err.clear
    '    wscript.quit
    'end if
    'wscript.echo "DEBUG: " & "write file ends"
end function

function ReadFile(folder, iteration)
    '''''start timer to count against read timer
    'this will be cumulatively built for each reading execution.
    'this stops the file creation and deletions from affecting the time
    StartTime = timer
    'open file for read
    set TextFileToRead = fso.OpenTextFile(folder & "_smbtestfile" & iteration, 1, false, 0)
    'read the file in chunks of 64KB, until the end of the stream
    do while TextFileToRead.atendofstream = false
        text = TextFileToRead.read(65536) 'read 65536 characters
    Loop
    '''''stop timer to count against read timer
    EndTime = timer
    TotalTime = TotalTime + (EndTime-StartTime)
    'wscript.echo "DEBUG: " & "Total time to READ file " & "_smbtestfile" & iteration & ": " & EndTime & "-" & StartTime & "=" & EndTime-StartTime
end function

function ReturnCheckInfo()
    'convert TotalTime from seconds to ms
    TotalTime = TotalTime*1000
    'wscript.echo "DEBUG: " & "Total time of " & OperationType & " operation (ms): " & TotalTime
    if OperationType = "READ" then
        if TotalTime > CRITICAL_level then
            wscript.echo "SMB_READTIME CRITICAL - read " & NumberOfFilesToWriteOrRead & " files in " & TotalTime & " ms|time=" & totaltime & "ms;" & WARNING_level & ";" & CRITICAL_level & "; ok=0"
            wscript.quit 2
        elseif TotalTime > WARNING_level then
            wscript.echo "SMB_READTIME WARNING - read " & NumberOfFilesToWriteOrRead & " files in " & TotalTime & " ms|time=" & totaltime & "ms;" & WARNING_level & ";" & CRITICAL_level & "; ok=0"
            wscript.quit 1
        else
            wscript.echo "SMB_READTIME OK - read " & NumberOfFilesToWriteOrRead & " files in " & TotalTime & " ms|time=" & totaltime & "ms;" & WARNING_level & ";" & CRITICAL_level & "; ok=1"
            wscript.quit 0
        end if
    end if
    if OperationType = "WRITE" then
        if TotalTime > CRITICAL_level then
            wscript.echo "SMB_WRITETIME CRITICAL - wrote " & NumberOfFilesToWriteOrRead & " files in " & TotalTime & " ms|time=" & totaltime & "ms;" & WARNING_level & ";" & CRITICAL_level & "; ok=0"
            wscript.quit 2
        elseif TotalTime > WARNING_level then
            wscript.echo "SMB_WRITETIME WARNING - wrote " & NumberOfFilesToWriteOrRead & " files in " & TotalTime & " ms|time=" & totaltime & "ms;" & WARNING_level & ";" & CRITICAL_level & "; ok=0"
            wscript.quit 1
        else
            wscript.echo "SMB_WRITETIME OK - wrote " & NumberOfFilesToWriteOrRead & " files in " & TotalTime & " ms|time=" & totaltime & "ms;" & WARNING_level & ";" & CRITICAL_level & "; ok=1"
            wscript.quit 0
        end if
    end if
end function

'You are now talking on #nagios
' hey all
' how do i make sure that two service checks don't occur within a set amount of time with each other?
' service 1's active check occurs at time 0, service 2's active check occurs at time 2.5
' timeperiods.
' mbrownnyc: are you trying to ensure that check 2 occurs 2.5 minutes after check 1?
' yes
' valcor, thanks
' well, explicitly, I am more flexible
' but about 2.5 minutes yes
' dnsmichi: I don't think timeperiods works in this case :)
' mbrownnyc: without much tricker, there is no way that I know of
' you could write a wrapper script on check 1, that reschedules check 2, and executes the original check 1 script
' that's very complex
' not as hard as it sounds
' but a good idea
' if that's the only way
' http://old.nagios.org/developerinfo/externalcommands/commandinfo.php?command_id=29
' I have written a plugin check_smb_speed, which is executed over NRPE or NCSA
' I am using NRPE right now
' Valcor: uhm i though the numbers were times of the day
' +t
' but if they are start+end time, timepriods are useless, true.
' i have a _desire_ to execute on that schedule to allow caches to be "flushed" (0 minute, 2.5 minute)
' dnsmichi: understood, that's why I asked for clarification :)
' the hdd subsystem is pretty well utilized (over SMB), so I expect the caches will be flushed by that time
' but I can not quantify this
' yep thx. monday is always the not-yet-100%-power-day.
' I don't have access to these numbers, per se
' in fact, they may not be "flushed" by 2.5 minutes
' they may not be flushed in 6000 minutes if nothing is used
' but, the results should reveal this in itself
' hm, another attempt would be calling an event handler, passing the timestamp when to be checked somewhere, then a cron reads that every minute, and if matched, the passive check result is inserted into the command pipe
' something like the wrapper Valcor proposed
' if all of a sudden I jump from my baseline, let's say 350ms to 100ms, then there's a problem and the caches are being utilized too well
' main problem is that if you use that call as an active check, 150 sec for a script run will ruin the scheduler
' the script latency is only about 15 seconds, 45 at max, but yes i understand
' basically, I'll let it run in prod for some time and judge how to react
' if my described instance occurs (caches being too well utilized, hence the test doesn't reflect real world at all), I will go back to the drawing board
' thanks for your'ss' inputs