
VMware
Check hardware running VMware ESXi
Description:
Python script that lets you check your hardware’s health when it runs the (free) VMware ESXi appliance.
Current Version
Last Release Date
June 12, 2009
Compatible With
Owner
Project Files
File | Description |
---|---|
check_esx_wbem.py | check_esx_wbem.py |
Project Notes
The (free) VMware ESXi does not let you install your own manufacturer agents (Dell OpenManage, for example). This Python script queries the VMware CIM agent so that you can monitor your hardware's global health.
Pre-req : Python with the pywbem module
Usage : ./check_esx_wbem.py https://hostname:5989 user password [verbose]
Example : ./check_esx_wbem.py https://myesxi:5989 root password
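As a rough illustration of what the plugin does with those arguments, the following sketch enumerates a few CIM classes and collects each element's name and first OperationalStatus value. It is not the plugin's source: the function names `connect` and `collect_statuses` and the shortened class list are assumptions made for this example, and `pywbem` is imported lazily so the collecting logic can be tried without a live host.

```python
# Sketch only: names and class list are illustrative, not taken from check_esx_wbem.py.
CLASSES_TO_CHECK = ["CIM_ComputerSystem", "CIM_NumericSensor", "CIM_Processor"]


def connect(url, user, password):
    """Open a WBEM connection to the ESXi CIM agent, e.g. https://myesxi:5989."""
    import pywbem  # imported here so the rest of the sketch runs without pywbem
    return pywbem.WBEMConnection(url, (user, password))


def collect_statuses(conn, class_names=CLASSES_TO_CHECK):
    """Return (class, element name, first OperationalStatus or None) tuples."""
    results = []
    for class_name in class_names:
        for instance in conn.EnumerateInstances(class_name):
            name = instance["ElementName"] if "ElementName" in instance else None
            status = (instance["OperationalStatus"]
                      if "OperationalStatus" in instance else None)
            # OperationalStatus is an array property; the first entry is the main state.
            first = status[0] if status else None
            results.append((class_name, name, first))
    return results
```

With a real host this would be called as `collect_statuses(connect("https://myesxi:5989", "root", "password"))`. In the CIM schema an OperationalStatus of 2 means OK and 0 means Unknown. Recent pywbem releases verify the server certificate by default, so a self-signed ESXi certificate may need extra connection options.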
Reviews
(10)
The plugin works fine with ESXi 4.1 (the new licensed version).
If you do not want to use the root account but a dedicated account for monitoring, you can try these steps:
In vSphere:
- Create a nagios user
- Add this user to the root group
- Assign the "No access" role to the nagios user
You'll have the right to access the host from the plugin, but the nagios user will not be able to log in with the vSphere client, the console, or SSH.
Here is a guide to get it working, to save you some time (Gentoo configuration):
1. Emerge python
2. Download pywbem-0.7.0.tar.gz from http://sourceforge.net/projects/pywbem/files/pywbem/pywbem-0.7/
3. tar -xvf pywbem-0.7.0.tar.gz
4. cd pywbem-0.7.0
5. python setup.py build
6. python setup.py install
7. to test pywbem : $ python
Python 2.6.6 (r266:84292, Sep 14 2011, 06:53:15)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pywbem
>>>
8. srvmon-Bellakt plugins # ./check_esx_wbem.py https://10.1.1.xxx:5989 root password
If you get the following error (I was testing on a virtual machine):
Traceback (most recent call last):
  File "./check_esx_wbem.py", line 75, in <module>
    instance_list = wbemclient.EnumerateInstances(classe)
  File "/usr/lib/python2.6/site-packages/pywbem/cim_operations.py", line 404, in EnumerateInstances
    **params)
  File "/usr/lib/python2.6/site-packages/pywbem/cim_operations.py", line 219, in imethodcall
    raise CIMError(code, tt[1]['DESCRIPTION'])
pywbem.cim_operations.CIMError: (6, u'The requested object could not be found')
9. To remove this error, comment out the memory class in the script's class list (# 'CIM_Memory',)
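Instead of editing the class list by hand, the error above could be skipped at run time. The sketch below assumes the connection object raises an exception whose first argument is the CIM status code, as the traceback's `CIMError: (6, ...)` suggests. The function name `enumerate_available` is made up for this example and is not part of the plugin.

```python
CIM_ERR_NOT_FOUND = 6  # status code shown in the traceback above


def enumerate_available(conn, class_names):
    """Enumerate each class, skipping those the CIM agent does not provide.

    Returns (instances_by_class, skipped_class_names).
    """
    found, skipped = {}, []
    for class_name in class_names:
        try:
            found[class_name] = conn.EnumerateInstances(class_name)
        except Exception as exc:  # pywbem raises CIMError; kept generic for the sketch
            code = exc.args[0] if exc.args else None
            if code != CIM_ERR_NOT_FOUND:
                raise  # authentication or other failures should still surface
            skipped.append(class_name)
    return found, skipped
```

Reporting the skipped classes in verbose mode would keep it visible that, for example, CIM_Memory was not checked on a given host.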
The version I am running has been through six revisions compared to this version.
Here's the change log of the version I have:
#@---------------------------------------------------
#@ History
#@---------------------------------------------------
#@ Date : 20080820
#@ Author : David Ligeret
#@ Reason : Initial release
#@---------------------------------------------------
#@ Date : 20080821
#@ Author : David Ligeret
#@ Reason : Add verbose mode
#@---------------------------------------------------
#@ Date : 20090219
#@ Author : Joshua Daniel Franklin
#@ Reason : Add try/except to catch AuthError and CIMError
#@---------------------------------------------------
#@ Date : 20100202
#@ Author : Branden Schneider
#@ Reason : Added HP Support (HealthState)
#@---------------------------------------------------
#@ Date : 20100512
#@ Author : Claudio Kuenzler www.claudiokuenzler.com
#@ Reason : Combined different versions (Joshua and Branden)
#@ Reason : Added hardware type switch (dell or hp)
#@---------------------------------------------------
#@ Date : 20100626/28
#@ Author : Samir Ibradzic www.brastel.com
#@ Reason : Added basic server info
#@ Reason : Wanted to have server name, serial number & bios version at output
#@ Reason : Set default return status to Unknown
#@---------------------------------------------------
#@ Date : 20100702
#@ Author : Aaron Rogers www.cloudmark.com
#@ Reason : GlobalStatus was incorrectly getting (re)set to OK with every CIM element check
#@---------------------------------------------------
#@ Date : 20100705
#@ Author : Claudio Kuenzler www.claudiokuenzler.com
#@ Reason : Due to change 20100702 all Dell servers would return UNKNOWN instead of OK...
#@ Reason : ... so added Aaron's logic at the end of the Dell checks as well
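The 20100702 and 20100705 entries describe a status-aggregation bug. A minimal sketch of the idea, not the script's actual code: start from UNKNOWN and only ever move to a more severe state, so that one healthy element cannot overwrite an earlier failure. The names `combine`, `global_status` and `SEVERITY` are hypothetical; the exit codes are the standard Nagios plugin values.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes

# UNKNOWN ranks lowest here because it is only the "nothing checked yet" default.
SEVERITY = {UNKNOWN: 0, OK: 1, WARNING: 2, CRITICAL: 3}


def combine(current, new):
    """Return whichever of the two states is more severe."""
    return new if SEVERITY[new] > SEVERITY[current] else current


def global_status(element_states):
    """Fold per-element states into one result, starting from UNKNOWN."""
    status = UNKNOWN
    for state in element_states:
        status = combine(status, state)
    return status
```

With this ordering, a host where no element reports any state stays UNKNOWN, while a single CRITICAL element is preserved no matter how many OK elements follow it.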
Hi,
Great plugin. The only thing I noticed is that it detects disk errors (as an experiment, I pulled a disk out of a RAID 5 array), but the status is not set to warning or critical.
Here is the verbose output:
20100525 16:49:12 Check classe CIM_ComputerSystem
20100525 16:49:12 Element Name = System Board 7:1
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = System Internal Expansion Board 16:1
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = System Internal Expansion Board 16:2
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = System Internal Expansion Board 16:5
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = esx-test01.roland-domaene.intra
20100525 16:49:12 Element Name = Hardware Management Controller (Node 0)
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = HP Smart Array P400 Controller : HPSA1
20100525 16:49:12 Check classe CIM_NumericSensor
20100525 16:49:12 Element Name = System Board 1 Power Meter
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Power Domain 1 Temp 5
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 11 Temp 4
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 10 Temp 3
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = External Environment 9 Temp 2
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = System Internal Expansion Board 5 Temp 1
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 4 Fan 6
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 3 Fan 5
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 2 Fan 4
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = Processor 1 Fan 3
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = System Internal Expansion Board 2 Fan 2
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Element Name = System Internal Expansion Board 1 Fan 1
20100525 16:49:12 Element Op Status = 2
20100525 16:49:12 Check classe CIM_Memory
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-1 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-2 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Proc 1 Level-3 Cache
20100525 16:49:12 Element Op Status = 0
20100525 16:49:12 Element Name = Memory
20100525 16:49:12 Check classe CIM_Processor
20100525 16:49:13 Element Name = Proc 1
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Check classe CIM_RecordLog
20100525 16:49:13 Element Name = IPMI SEL
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Check classe OMC_DiscreteSensor
20100525 16:49:13 Element Name = System Board 1 Fans
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = Processor Module 1 VRM 1
20100525 16:49:13 Element Name = Power Supply 3 Power Supplies
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = Power Supply 2 Power Supply 2: Presence detected
20100525 16:49:13 Element Name = Power Supply 2 Power Supply 2: Failure detected
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = Power Supply 1 Power Supply 1: Presence detected
20100525 16:49:13 Element Name = Power Supply 1 Power Supply 1: Failure detected
20100525 16:49:13 Element Op Status = 2
20100525 16:49:13 Element Name = System Chassis 3 Ext. Health LED
20100525 16:49:13 Element Name = System Chassis 2 Int. Health LED
20100525 16:49:13 Element Name = System Chassis 1 UID Light
20100525 16:49:13 Check classe VMware_StorageExtent
20100525 16:49:13 Element Name = Disk 1 on HPSA1 : Port 1I Box 1 Bay 6 : 136GB : Data Disk : Disk Error
20100525 16:49:13 Element Name = Disk 2 on HPSA1 : Port 1I Box 1 Bay 5 : 136GB : Data Disk
20100525 16:49:13 Element Name = Disk 3 on HPSA1 : Port 2I Box 1 Bay 3 : 136GB : Data Disk
20100525 16:49:13 Element Name = Disk 4 on HPSA1 : Port 2I Box 1 Bay 2 : 136GB : Data Disk
20100525 16:49:13 Element Name = Disk 5 on HPSA1 : Port 2I Box 1 Bay 1 : 136GB : Data Disk
20100525 16:49:13 Check classe VMware_Controller
20100525 16:49:13 Element Name = HP Smart Array P400 Controller : HPSA1
20100525 16:49:13 Check classe VMware_StorageVolume
20100525 16:49:13 Element Name = Logical Volume 1 on HPSA1 : RAID 5 : 546GB : Disk 1,2,3,4,5 : Interim Recovery
20100525 16:49:13 Check classe VMware_Battery
20100525 16:49:14 Element Name = Battery on HPSA1
20100525 16:49:14 Check classe VMware_SASSATAPort
OK
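In the output above, the disk and volume elements print no "Element Op Status" line, which suggests they carry no OperationalStatus for the script to evaluate. One possible workaround, sketched here under the assumption that the storage classes expose the standard CIM HealthState property on this hardware (the output does not confirm this), is to fall back to HealthState. The function name and the mapping to Nagios states are illustrative choices.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def state_from_instance(instance):
    """Derive a Nagios state from OperationalStatus, else from HealthState."""
    op_status = (instance["OperationalStatus"]
                 if "OperationalStatus" in instance else None)
    if op_status:
        first = op_status[0]
        if first == 2:      # CIM OperationalStatus 2 = OK
            return OK
        if first == 0:      # 0 = Unknown
            return UNKNOWN
        return CRITICAL     # any other value is treated as a failure in this sketch
    health = instance["HealthState"] if "HealthState" in instance else None
    if health is None or health == 0:   # missing, or 0 = Unknown
        return UNKNOWN
    if health == 5:                     # 5 = OK
        return OK
    if health in (10, 15):              # 10 = Degraded/Warning, 15 = Minor failure
        return WARNING
    return CRITICAL                     # 20 and above = major failure or worse
```

An element that reports neither property still comes back as UNKNOWN, so a failed disk would need one of the two properties populated by the CIM provider before this approach could flag it.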