Monitoring Dell Blade Servers with Hardware Sentry
KB1040 - Sep 16, 2010 - Last reviewed on Aug 01, 2013
Type: Best Practice
Description: Monitoring Dell Blade Servers with Hardware Sentry KM. Supported communication protocols and discovered hardware components.
Additional Keywords: Blade, Blade Servers, Chassis, Dell, DRAC, Telnet
- The Hardware Instrumentation section covers the basic Dell Blade Servers hardware components and the supported communication protocols.
- The Setting up... section provides detailed instructions for installing and configuring Hardware Sentry KM for PATROL to properly monitor Dell Blade Servers.
- The Discovered Components and Monitored Parameters section lists the components automatically discovered and monitored by Hardware Sentry KM for PATROL, as well as the connectors required to ensure proper monitoring.
- The Troubleshooting section answers the most frequently asked questions about problems caused by installation or configuration errors.
About Dell Blade Servers
Some Dell PowerEdge servers (such as the 1855 and 1955 models) are what are commonly called “blades”, i.e. servers that are inserted into a main chassis shared by several blades. The main enclosure provides electrical power, cooling, network switching and media devices to all of the blades installed in it.
Such an enclosure is not a computer per se and no software can be installed on it.
Dell’s blade enclosures are called Dell Modular Server Enclosures (more commonly known as Dell Blade servers). They are equipped with a DRAC/MC card (Dell Remote Access Controller/Modular Chassis), which is the management interface for the enclosure: it lets administrators configure the enclosure and locate and diagnose problems in the chassis.
Although the card is often simply called “DRAC”, this version of the Dell Remote Access Controller is slightly different from the one found in regular PowerEdge servers, hence its actual name, “DRAC/MC”.
Hardware Sentry uses the Telnet protocol to connect to the DRAC/MC card and to discover and monitor the hardware components of the main shared enclosure, i.e. the Dell blade chassis.
As no software can be installed on the shared enclosure, Hardware Sentry needs to be installed on a separate system and then configured to poll the DRAC/MC card of the chassis over the network.
Most often, Hardware Sentry is installed on a blade in the chassis to be monitored. The advantage is that the hardware monitoring solution runs on the platform itself, so no external independent system is needed to monitor the Dell blade chassis. The downside is that hardware monitoring stops as soon as that blade is taken down.
- The DRAC/MC card of the Dell Modular Server Enclosure must be configured to operate on the network. The Telnet protocol must be enabled and a user account must be created and configured for the hardware monitoring solution. Hardware Sentry does not require full administrative rights.
- The system from which the blade chassis will be monitored must be able to communicate with the DRAC/MC of the chassis. It must run either Microsoft Windows or Linux, and Sun’s Java Runtime Environment (JRE) must be installed (version 1.5.00 minimum).
- Install the PATROL Agent on the system in charge of monitoring the shared chassis.
- Install Hardware Sentry KM for PATROL on the server (this can be done at the same time as the PATROL Agent). Please follow the instructions of the Installation Guide of Hardware Sentry.
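The network prerequisites above can be checked before configuring Hardware Sentry. The following is a minimal Python sketch, not part of Hardware Sentry, that only tests whether a TCP connection to the Telnet port of the DRAC/MC can be opened from the monitoring system. The function name `telnet_port_reachable`, the default port 23 and the example address in the comment are assumptions made for illustration. The sketch does not validate the user account.

```python
import socket


def telnet_port_reachable(host, port=23, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened.

    This only proves that the port accepts connections from this system;
    it does not log in or check credentials.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors.
        return False


# Example call (hypothetical DRAC/MC address, replace with the real one):
# telnet_port_reachable("192.0.2.10")
```

If the function returns False, check that Telnet is enabled on the DRAC/MC and that no firewall blocks the port between the two systems.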
Once installed, Hardware Sentry should monitor the hardware of the local system. If Hardware Sentry is installed on a blade in the chassis and the installation procedure for Dell PowerEdge servers has been carefully followed, it should display the blade model (e.g. PowerEdge 1855) as well as its internal components under an icon named Hardware. To monitor the shared chassis:
- Right-click the main Hardware icon › KM Command › Add a Remote System or An External Device…
- Enter a display name for the shared chassis as well as the IP address of the DRAC/MC card
- Select System Type “Management Card/Chip, Blade Chassis, ESXi”
- Select Manually choose which connectors to use and click Next.
- Select Dell DRAC/MC (Dell Remote Access Controller/Modular Chassis) in the connector list and click Next
- Select Telnet for the protocol and enter valid credentials to connect to the DRAC/MC, click Next and Finish
The single Hardware icon is renamed Hardware on localhost and another icon is created next to it, named Hardware on followed by the display name of the Dell blade chassis.
The following connector must be selected to monitor the blade server enclosure:
- Dell DRAC/MC (Dell Remote Access Controller/Modular Chassis)
Components and monitored parameters
The following components and parameters are discovered and monitored:
- Chassis model and serial (tag) number
- Per-chassis overall status
- Power consumption of the whole system (shared chassis and blades inside)
- Temperature sensors, actual temperature
- Fans, speed of each fan
- Voltage sensors, actual voltage
- Power supplies, status
- Blades, model, serial number, status
- I/O switches, status
- DRAC/MC cards, status
- KVM module, status
Hardware Sentry cannot connect to the DRAC/MC card through Telnet if it does not find the Java JRE
If the instance of Hardware Sentry set up to monitor the blade chassis is running on Windows, it will automatically find the Java JRE based on the PATH and JAVA_HOME system environment variables.
It proceeds the same way on Linux systems, except that most of the time the JAVA_HOME variable is not properly set for the PATROL Agent default account. In such a case, you can either modify the environment of the PATROL Agent’s default account so that the JAVA_HOME variable is properly set, or explicitly set the /SENTRY/HARDWARE/javaPath variable in the agent’s configuration to the bin folder that contains the “java” executable (example: /opt/java/jre/bin).
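The lookup described above can be reproduced by hand to see what the PATROL Agent account would find. The Python sketch below is an illustration only, not Hardware Sentry's actual code: the function name `find_java`, its parameters and the exact lookup order (explicit folder first, then JAVA_HOME, then PATH) are assumptions.

```python
import os
import shutil


def find_java(java_path=None, environ=None):
    """Return the path of a java executable, or None if none is found.

    Assumed lookup order (for illustration only):
      1. java_path, an explicit bin folder playing the role of javaPath
      2. the bin folder under JAVA_HOME
      3. the folders listed in PATH
    """
    environ = os.environ if environ is None else environ
    exe = "java.exe" if os.name == "nt" else "java"

    candidates = []
    if java_path:
        candidates.append(os.path.join(java_path, exe))
    java_home = environ.get("JAVA_HOME")
    if java_home:
        candidates.append(os.path.join(java_home, "bin", exe))

    for candidate in candidates:
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate

    # Fall back to a PATH search; an empty PATH yields None.
    return shutil.which("java", path=environ.get("PATH", ""))
```

Running `find_java()` under the PATROL Agent’s default account shows whether that account’s environment exposes a Java executable. A result of None suggests that JAVA_HOME or the javaPath variable needs to be set as described above.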
DRAC/MC fails to respond to Telnet requests
The DRAC/MC may randomly fail to respond to Hardware Sentry’s Telnet requests. In such a case, error messages (“No collect value for…”) appear in the PATROL Console System Output Window. In the worst case, an alert is triggered on all components of the shared chassis, as if they were all missing. This problem occurs because the Telnet server of the DRAC/MC card is too slow. If you experience this issue (all components marked as “missing” and in alarm), we recommend that you disable the “Missing Device Detection” mechanism of Hardware Sentry.
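To judge whether a slow Telnet server is the likely cause, the response time of the DRAC/MC can be measured from the monitoring system. The Python sketch below is a hedged illustration, not a Hardware Sentry feature. It assumes that the card sends some bytes (Telnet option negotiation or a login prompt) right after the connection is opened. The name `seconds_to_first_reply` and the 10-second default timeout are arbitrary choices.

```python
import socket
import time


def seconds_to_first_reply(host, port=23, timeout=10.0):
    """Return the seconds elapsed until the server sends its first bytes.

    Returns None if the connection fails, times out, or is closed by the
    server before any data is received.
    """
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            data = sock.recv(1024)
    except OSError:
        # Includes refused connections and timeouts.
        return None
    if not data:
        return None
    return time.monotonic() - start
```

Calling the function several times in a row and comparing the results gives a rough idea of how erratic the card’s Telnet server is. Frequent None results or long delays are consistent with the problem described above.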