Configuring Hardware Sentry KM for PATROL

By default, Hardware Sentry is configured to best match business requirements. However, various configuration options are available to meet your specific needs.

Configuring the KM settings

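Various configuration options are available if administrators wish to:

  • customize the discovery interval
  • customize the polling interval
  • configure the Java settings
  • deactivate device classes.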

Customizing the Discovery Interval

Hardware Sentry performs discoveries to detect new hardware components or detect those that have gone missing.

By default, Hardware Sentry runs a discovery every hour. To customize this interval, right-click the Hardware icon > KM Commands > KM Settings > Discovery Interval.

Customizing Discovery Cycle

Customizing the Polling Interval

Hardware Sentry polls the managed systems to collect hardware health data. By default, the polling interval for this “data-collect” is set to every 2 minutes.

To change the polling interval for the managed system, right-click on the Hardware icon > KM Commands > KM Settings > Polling Interval…

Customizing Polling Interval

The polling interval is a global setting that applies to the entire managed infrastructure. You can also manually trigger a poll at any time on an individual instance to refresh its parameter values: right-click the instance icon > Refresh Parameters.
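
To get a feel for how the two intervals relate, the sketch below computes how many data collections fit between two discoveries with the default values quoted above. The function name is hypothetical and the code is not part of the product.

```python
def polls_per_discovery(discovery_interval_s: int, polling_interval_s: int) -> int:
    """Return how many whole polling cycles fit in one discovery cycle."""
    if discovery_interval_s <= 0 or polling_interval_s <= 0:
        raise ValueError("intervals must be positive")
    return discovery_interval_s // polling_interval_s


# Defaults from the text: discovery every hour, polling every 2 minutes.
DEFAULT_DISCOVERY_S = 60 * 60
DEFAULT_POLLING_S = 2 * 60

print(polls_per_discovery(DEFAULT_DISCOVERY_S, DEFAULT_POLLING_S))  # 30
```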

Configuring Java Settings

The Java Settings wizard enables you to define which Java Runtime Environment (JRE) you wish Hardware Sentry to use. You can use automatic detection, select a pre-detected Java path, or manually enter the path to the Java executable directory.

By default, the product tries to locate a compatible Java Runtime Environment (JRE) on the system where the PATROL Agent runs, and uses it to run the Java Collection Hub, a core piece of the product that allows performing operations that would not normally be possible on a PATROL Agent (SSH, SSL, SNMPv3, etc.).

Performing SSH, WBEM-based collection, or SNMP v2 and v3 (i.e. for most remote Linux and UNIX systems, and most storage systems) requires Java Runtime Environment (JRE) version 1.8 or higher.
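
Before pointing Hardware Sentry to a JRE, it can help to check its version yourself. The sketch below parses the text printed by `java -version` and compares it with the 1.8 minimum stated above. The function names are hypothetical, and this is not the compatibility check that the KM performs internally.

```python
import re
import subprocess


def parse_java_version(version_output: str):
    """Extract (major, minor) from the text printed by `java -version`."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    minor = int(match.group(2) or 0)
    return major, minor


def meets_minimum(version_output: str, minimum=(1, 8)) -> bool:
    """True if the reported version is at least `minimum` (1.8 by default)."""
    version = parse_java_version(version_output)
    return version is not None and version >= minimum


def read_java_version_output(java_executable: str = "java") -> str:
    """Run `java -version`; the JVM prints this banner on stderr."""
    result = subprocess.run(
        [java_executable, "-version"], capture_output=True, text=True
    )
    return result.stderr
```

For example, `meets_minimum(read_java_version_output("/path/to/bin/java"))` would report whether the JRE at that (placeholder) path satisfies the requirement.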

To access the Java Settings wizard:

  1. Right-click the KM main icon > KM commands > KM Settings > Java Settings…

    Java Settings Configuration

  2. (Optional) Click the More Information button to display the status of the JRE compatibility check and the JRE path currently used by Hardware Sentry.

  3. Use the Path to Java Runtime Environment (JRE) field to specify the path of the JRE you want Hardware Sentry to use:
    • Automatically detected at runtime: The KM will automatically search for and use a compatible JRE at the initial discovery that occurs when the PATROL Agent and the KM are started.
    • List of detected JREs: The KM displays a list of the JREs available on your system. Select the JRE you wish to use.
    • Other: Select this option if you wish to manually enter the Java executable directory path in the field provided below.
    • (Optional) If you have selected the Other option, you can select the Do NOT verify the compatibility of the specified JRE option to prevent Hardware Sentry from verifying the compatibility of the specified JRE. Use this option only if you know that the provided JRE is compatible even if the compatibility check fails.
  4. (Optional) Configure the Advanced Settings:

    Java Settings Configuration

    • Credentials: Provide the specific credentials that Hardware Sentry will use to start the Java Virtual Machine (JVM). By default, the java process is launched with the same credentials as the PATROL Agent. You can provide specific settings if the PATROL Agent’s default account does not have sufficient privileges to perform the operations required by the Java Collection Hub.
    • Java Collection Hub Heap Memory: Set the minimum and maximum size (in MB) of the Java Hub Heap Memory according to your environment requirements (Default minimum Java heap size = 128 MB and default maximum Java heap size = 512 MB).
    • Additional Java Collection Hub JVM Options: In the Command line options field, enter any additional arguments for the java -jar … command line that Hardware Sentry uses to launch the Java Collection Hub. This can be particularly useful for debugging (for example, enter -Xdebug -Xrunjdwp:transport=dt_socket,address=4711,server=y,suspend=n to launch the KM in debug mode on a specific port).
    • Click Accept to save your settings. Hardware Sentry will automatically verify the relevance of the new settings and will warn you if a problem is detected. Changing the settings requires Hardware Sentry to stop the Java Collection Hub for about 10 seconds and then restart it. In the meantime, operations that leverage the Java Collection Hub will fail and Hardware Sentry will report an error in the corresponding Annotation point of the PATROL graph.
  5. Click Finish to save your changes.
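
To visualize how the advanced settings could translate into a JVM command line, here is a minimal sketch. It assumes that the heap sizes map to the standard `-Xms` and `-Xmx` JVM flags, which the source does not confirm, and the jar name is a hypothetical placeholder because the actual command line is elided in this documentation.

```python
import shlex


def build_jvm_command(java_path, jar_path, min_heap_mb=128, max_heap_mb=512,
                      extra_options=""):
    """Assemble a `java ... -jar` command line as a list of arguments."""
    if min_heap_mb > max_heap_mb:
        raise ValueError("minimum heap size cannot exceed maximum heap size")
    # Assumption: heap sizes are passed through the standard -Xms/-Xmx flags.
    command = [java_path, f"-Xms{min_heap_mb}m", f"-Xmx{max_heap_mb}m"]
    # JVM options must appear before -jar to be interpreted by the JVM itself.
    command += shlex.split(extra_options)
    command += ["-jar", jar_path]
    return command


# "collection-hub.jar" is a made-up name used for illustration only.
print(build_jvm_command("java", "collection-hub.jar"))
```

Building the command as a list rather than a single string avoids quoting problems when the Java path contains spaces.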

Deactivating Device Classes

Hardware Sentry allows you to set aside device classes from the monitoring process and, therefore, discard the metrics and performance data for those specific devices.

The KM will not trigger any alarm should a failure occur on a discarded device.

To deactivate a device class:

  1. Right-click Hardware Sentry KM icon > KM Commands > KM Settings > Deactivate Device Class…
  2. Select the device class(es) you wish to discard from the monitoring process:

    Deactivating Device Classes

  3. Click OK to confirm.

To reintegrate a previously discarded device class, simply uncheck the corresponding option in the Deactivate Device Class window.

Configuring Thresholds

To make sure hardware failures are detected at an early stage, Hardware Sentry automatically sets thresholds according to the manufacturers’ recommendations and triggers alarms as soon as these thresholds are breached. However, administrators can:

  • customize these thresholds to meet their specific needs
  • specify which actions will be executed when a hardware failure is detected
  • configure several types of alerts and notifications (internal KM issues notification, intrusion detection alerts, unknown status, etc.).

Selecting the Threshold Mechanism

By default, Hardware Sentry automatically determines which mechanism (Tuning or Event Management) is best suited to the managed system when it first runs. This threshold mechanism selection can however be modified later on through the Threshold Mechanism Selection KM Command:

  1. Right-click the main Hardware Sentry/local host icon > KM Commands > KM Settings > Additional Settings > Threshold Mechanism Selection…

    Selecting the Threshold Mechanism

  2. Select one of the following options:

    • Tuning: Hardware Sentry manages its thresholds through the standard internal PATROL mechanism (override parameters). Thresholds are stored in the PATROL Agent configuration under the /___tuning___ tree. A PATROL Agent v3.4.11 or higher is required.
    • Event Management: the KM manages its thresholds through the Event Management mechanism. Thresholds are stored in the PATROL Agent configuration under the /AS tree. This option requires that you set up PATROL for Event Management KM on your PATROL Agent. PATROL for Event Management has to be enabled and preloaded.
    • No Thresholds: Hardware Sentry sets no thresholds on the monitored objects. You are required to set them manually.
  3. Click OK.

To avoid side effects and unpredictable behavior, when you switch from one threshold management method to another, Hardware Sentry automatically migrates the instance-specific thresholds to the new mode and resets the ANYINST/ALL_INSTANCES thresholds to their default values. Note that a backup of the Hardware Sentry thresholds configuration is made prior to the migration.

For more information about the threshold mechanisms, please refer to our Knowledge Base articles Thresholds: the Event Management Thresholds Mechanism and Thresholds: the Tuning Thresholds Mechanism available on our Website.

Modifying Alert Thresholds

Hardware Sentry does not provide any specific way to customize thresholds. Thresholds can however be modified at any time using the standard tools provided by BMC such as pconfig, PCM or the Event Management KM. Hardware Sentry will not override the customizations.

Configuring Alert Actions and Notifications

Configuring Alert Actions

By default, upon hardware failure, Hardware Sentry triggers a PATROL event and annotates the parameter’s graph with a comprehensive report of the problem, giving details about the failure, the possible consequences and the recommended action to solve the problem. However, administrators can:

  • choose to trigger other actions when a hardware failure is detected such as executing OS or PSL commands, sending an E-mail, writing a line to a log file, etc.
  • customize the event messages using macros to obtain more information about the failure.

Editing Alert Actions

To modify the Alert Actions to be executed:

  1. Right-click the main Hardware icon > KM Commands > KM Settings > Edit Alert Actions…

    Editing Alert Actions — Selecting Parameters

  2. Check the boxes corresponding to the actions you would like to see executed upon a hardware failure.

  3. Click Next.

Triggering a PATROL Event

If you have selected Trigger a PATROL Event:

Editing Alert Actions — Alert Action: Trigger a PATROL Event

  • Select the type of PATROL event you wish to trigger when a hardware problem occurs:
    Problem Type    Event Type            Event Class
    Hardware        Specific (default)    HardwareProblem
    Hardware        Standard              41
  • In the Event Message field, enter the string that will be displayed with the event. You can use macros to customize the message and get more details about the problem.
  • Select the type of PATROL event you wish to trigger when a connector failure occurs:
    Problem Type    Event Type            Event Class
    Connector       Standard (default)    41
    Connector       Specific              ConnectorProblem
  • In the Event Message field, enter the string that will be displayed with the event. You can use macros to customize the message and get more details about the problem.

The triggered PATROL Event can then be viewed from:

  • Standard PATROL Consoles (Classic Console, PATROL Central)
  • PATROL Enterprise Manager
  • BMC Impact Manager
  • Other third-party products that interface with PATROL.

A Hardware Health Report is automatically generated when a PATROL Event occurs, to provide detailed information about the system impacted by the hardware failure or the threshold breach (device type and label, serial number, etc.).

Annotating the Parameter’s Graph

If you have selected the Annotate the parameter’s graph action, you need to enter the string that will be displayed in the annotation point. You can use macros to customize the message and get more details about the problem.

Editing Alert Actions — Alert Action: Annotate Parameter’s Graph

Executing an OS Command

If you have selected the Execute an OS command action:

Editing Alert Actions — Alert Action: Execute an OS Command

  • Enter a command line to be executed. The command:

    • can be a program utility or a shell script, and can have arguments
    • can contain macros that will be replaced at runtime
    • must be non-interactive (no window, no user input).
  • Enter the username and password used to run the command.

  • Click Next and Finish.
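
As an illustration, a non-interactive script such as the one below could serve as the target of the Execute an OS command action. The script name, argument order, and log path are hypothetical. The command line entered in the wizard might look like `python hw_alert.py "%{HOSTNAME}" "%{OBJECT_LABEL}" "%{PROBLEM}"`, with the macros replaced before the command runs.

```python
import sys
from datetime import datetime

# Hypothetical default location; adjust to a path the run-as user can write to.
DEFAULT_LOG_PATH = "hw_alerts.log"


def format_alert_line(hostname, object_label, problem, now=None):
    """Build one log line from the macro-expanded arguments."""
    now = now or datetime.now()
    return f"{now:%Y-%m-%d %H:%M:%S} {hostname} {object_label}: {problem}"


def main(argv, log_path=DEFAULT_LOG_PATH):
    """Append the alert to a log file; never prompts for input."""
    if len(argv) != 4:
        sys.stderr.write("usage: hw_alert.py HOSTNAME OBJECT_LABEL PROBLEM\n")
        return 2
    line = format_alert_line(argv[1], argv[2], argv[3])
    with open(log_path, "a", encoding="utf-8") as log_file:
        log_file.write(line + "\n")
    return 0


# To use this as a standalone script, end the file with:
#     sys.exit(main(sys.argv))
```

The script reads everything it needs from its arguments and returns an exit code, which keeps it compatible with the "no window, no user input" requirement.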

Executing a PSL Command

PSL commands are for PATROL advanced users.

If you have selected the Execute a PSL command action, you need to specify the PSL statement to be executed by the PATROL Agent. Although only a single line is permitted, it can have several PSL instructions and contain alert action macros that will be replaced at runtime.

Editing Alert Actions — Alert Action: Execute a PSL Command

Sending a Pop-up to the PATROL Consoles

If you have selected the Send a pop-up to the PATROL Consoles action, you need to provide the message that will be displayed in the pop-up as well as in the title of the pop-up window. You can use alert action macros that will be replaced at runtime.

Editing Alert Actions — Alert Action: Send a Pop-up to the PATROL Consoles

Writing a Line to a LOG File

If you have selected the Write a line to a LOG file action:

Editing Alert Actions — Alert Action: Write a Line to a LOG File

  • Indicate the LOG file path.
  • Enter the content of the line to be written in the LOG file. You can use alert action macros to customize the content and get more information about the problem.

Sending a Basic SNMP Trap

If you have selected the Send a basic SNMP trap action:

Editing Alert Actions — Alert Action: Send a Basic SNMP Trap

Enter the following:

  • IP address or hostname of the SNMP trap destination
  • SNMP port and community string
  • Text that will be sent in the SNMP trap

Upon a hardware failure, Hardware Sentry will send the trap that is defined in the PATROL MIB (Trap number 11, Enterprise ID: 1.3.6.1.4.1.1031.1.1.2, the text is stored in the 1.3.6.1.4.1.1031.1.1.2.1 OID). You can use macros to customize the event message and get more information about the problem.

Sending a Custom SNMP Trap

If you have selected the Send a custom SNMP trap action:

Editing Alert Actions — Alert Action: Send a Custom SNMP Trap

Enter the following:

  • IP address or hostname of the SNMP trap destination
  • SNMP port and community string
  • All the characteristics of the trap: Enterprise ID, trap number and up to 4 varbinds.

You can use alert action macros that will be replaced at runtime.

Sending an Email

If you have selected the Send an E-mail action:

Editing Alert Actions — Alert Action: Send an E-mail

  • Provide the sender and the recipient email addresses in the From and To fields, respectively
  • Type the SMTP server name
  • Enter the email Subject and type the Body of the message you wish to send.
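
If you want to confirm that the sender, recipient, and SMTP server values work before entering them in the wizard, a small script using Python's standard library can send a test message with the same fields. This is only a sketch with hypothetical function names. It does not reflect how Hardware Sentry itself sends email, and it assumes the SMTP server accepts unauthenticated connections on the default port.

```python
import smtplib
from email.message import EmailMessage


def build_alert_email(sender, recipient, subject, body):
    """Create a plain-text message with the From, To, Subject and Body fields."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = subject
    message.set_content(body)
    return message


def send_alert_email(smtp_server, message):
    """Send the message through the given SMTP server (default port 25)."""
    with smtplib.SMTP(smtp_server) as connection:
        connection.send_message(message)
```

Calling `send_alert_email` with your SMTP server name and a message from `build_alert_email` should deliver a test email if the values are valid.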

Customizing Event Messages

The event messages triggered when a hardware failure is detected can be customized with macros to provide more details about the problem.

  1. Edit the alert action(s) for which you wish to customize the messages.
  2. In the Event Message or Enter the text (…) type of fields, enter the message you would like to be displayed. The following macros can be used to obtain more details about the problem. They will be replaced at runtime.
Macros Description
%{/variable_name} For advanced users only. Value of the variable_name instance variable. Example: %{/worstParam} will give the name of the “worst parameter” of the instance that triggered the alert.
%{ALARM_TYPE} Type of alert triggered (ALARM, WARNING or INFORMATION).
%{ASCTIME:...} Current date and time formatted as specified in the macro. Example: %{ASCTIME:%b %d %T %Y} will produce Aug 25 11:14:53 2018. The available formats for the %{ASCTIME:…} macro are listed in the Format Symbols for %{ASCTIME:…} macros section of this documentation.
%{CONSEQUENCE} Description of the possible consequence of the detected problem. Example: The temperature of the chip, component or device that was cooled by this fan should grow quickly. This can lead to severe hardware damage and system crashes.
%{DATE} Current date in the YYYY-MM-DD format.
%{FULLREPORT} Full hardware health report about the instance that triggered the alert. Example: see the output of the Instant Hardware Health Report Menu command.
%{HOSTNAME} Name of the computer monitored by the PATROL Agent.
%{NEWLINE}, %{\n} Linefeed. This is useful to produce multi-line information.
%{OBJECT_CLASS} Class of the instance that triggered the alert. Example: MS_HW_FAN
%{OBJECT_DEVICEID} Hardware Sentry internal device ID of the instance that triggered the alert. Example: 1.1
%{OBJECT_ID} PATROL internal ID of the instance that triggered the alert. Example: MS_HW_DellOpenManagehdf_11
%{OBJECT_LABEL} Display name of the instance that triggered the alert. Example: Fan: 1.1 (CPU1)
%{OBJECT_TYPE} Type of the device that triggered the alert. Example: Fan
%{PARAMETER_NAME} Name of the parameter that triggered the alert. Example: PredictedFailure
%{PARENT_CLASS} Class of the object that the faulty instance is attached to. Example: MS_HW_ENCLOSURE
%{PARENT_DEVICEID} Hardware Sentry internal device ID of the object that the faulty instance is attached to. Example: 1
%{PARENT_ID} PATROL internal ID of the object that the faulty instance is attached to.
%{PARENT_LABEL} Display name of the object that the faulty instance is attached to. Example: Computer: Dell PowerEdge 1600SC
%{PARENT_TYPE} Type of the object that the faulty instance is attached to. Example: Computer enclosure
%{PROBLEM} Description of the problem encountered by the monitored device. Example: The speed of this fan is critically low (1503 rpm).
%{RAW_VALUE} Raw value of the parameter that triggered the alert. Example: 67.30000
%{RECOMMENDED_ACTION} Recommended action to solve the problem. Example: Check if the fan is really no more cooling the system. If so, replace the fan.
%{SYSTEM_DOMAIN} Name of the domain the monitored system belongs to.
%{SYSTEM_FQDN} FQDN (fully qualified domain name) of the monitored system.
%{SYSTEM_HOSTNAME} IP address or full name as specified by the user when adding the new monitored system (i.e. MS_HW_MAIN/<id>/hostname).
%{SYSTEM_IP} IP address of the monitored system.
%{SYSTEM_METAFQDN} MetaFQDN of the monitored system (i.e. FQDN/IPAddress).
%{SYSTEM_NAME} Name of the monitored host or hostname specified while configuring the server monitoring.
%{SYSTEM_TOKENID} TrueSight OM device Token-ID (i.e. MS_HW_MAIN/<id>/MetaTokenID).
%{TIME} Current time in the HH:MM:SS format. Example: 11:14:53. The available formats for the %{TIME:…} macro are listed in the Format Symbols for %{ASCTIME:…} macros section of this documentation.
%{VALUE} Formatted value (with unit) of the parameter that triggered the alert. Example: 67.3 °C
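
To preview what a message template will look like once the macros are replaced, you can emulate the substitution with sample values. The sketch below is a simplified illustration, not Hardware Sentry's implementation: the function name and sample values are hypothetical, and formatted macros such as %{ASCTIME:…} are left untouched.

```python
import re

# Matches %{NAME} and captures NAME.
MACRO_PATTERN = re.compile(r"%\{([^}]+)\}")


def expand_macros(template, values):
    """Replace %{NAME} macros with sample values; unknown macros stay as-is."""
    def replace(match):
        name = match.group(1)
        if name in ("NEWLINE", "\\n"):
            return "\n"
        return str(values.get(name, match.group(0)))
    return MACRO_PATTERN.sub(replace, template)


sample_values = {
    "HOSTNAME": "srv1",
    "OBJECT_LABEL": "Fan: 1.1 (CPU1)",
    "PROBLEM": "The speed of this fan is critically low (1503 rpm).",
}
print(expand_macros("%{HOSTNAME} - %{OBJECT_LABEL}%{NEWLINE}%{PROBLEM}", sample_values))
```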

Configuring Internal KM Issues Notifications

The notification feature enables Hardware Sentry to inform you of any internal problem that may occur or any special operation that may be performed.

The objective of these notifications is to help resolve any potential issue, other than hardware problems – which are reported through the regular system of warnings and alerts.

To configure the notification feature:

  1. Right-click the Hardware icon > KM Commands > KM Settings > Additional Settings > Internal KM Issues Notification:

    Internal KM Issues Notification

  2. Select the type of messages you would like to be notified about and how you want to be notified (messages in System Output Window and/or PATROL events):

    • (Default) All internal messages: these messages are sent to the System Output Window and registered as PATROL events (event ID “HardwareSentryInternal” of the MS_HW_MAIN event catalog).
    • Only internal problems
    • No internal message
  3. Click OK.

Internal errors are registered as “ERROR” (they appear in orange in PATROL Event Manager), internal warnings are registered as “WARNING”, and informative messages are registered as “INFORMATION”.

Configuring Intrusion Detection Alerts

Hardware Sentry triggers alerts if the enclosure/chassis of your managed system has been opened. An unexpectedly opened chassis could imply that the system is physically accessed by an unauthorized person. The Intrusion Detection Alert feature enables you to select the alert conditions.

To configure intrusion detection alerts:

  1. Right-click the Hardware icon > KM Commands > KM Settings > Intrusion Detection Alerts:

    Intrusion Detection Settings

  2. Select the alert to be raised. By default, Hardware Sentry triggers an alert ONLY upon the opening or breach of a closed chassis.

  3. Click OK.

Configuring Network Link Alerts

Monitoring the communication performance of network cards may be critical to ensure that they run at the expected speed and to avoid network congestion or other performance issues. By default, Hardware Sentry triggers alerts when a break in the network link is detected for the parameters detailed below. These alerts are raised only if a previously detected network link is broken.

To change the default settings and select when to trigger alerts:

  1. Right-click on the Hardware icon > KM Commands > This System’s Settings > Network Link Alerts

    Configuring Network Link Alerts

  2. For each parameter, define what type of alerts should be triggered (an INFORMATION, a WARNING, or an ALARM).

  3. For each parameter, select when you want alerts to be triggered. By default:

    • the LinkStatus parameter triggers a WARNING if a network interface previously connected to the network is detected unplugged. However, it does not trigger any alert for network interfaces that have never been connected.
    • the DuplexMode parameter triggers a WARNING when an adapter that was running in “full-duplex” suddenly starts communicating in “half-duplex”.
    • the LinkSpeed parameter triggers a WARNING when the adapter “slows down”, i.e. re-negotiates the link speed with its remote counterpart, and is forced to downgrade to a slower communication rate.
  4. Click OK.
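
The default rules described in step 3 can be summarized with a short sketch. It is a simplified model under stated assumptions: the class and function names are hypothetical, "previously connected" is approximated by the state seen at the previous poll, and a missing previous state stands for an interface that has never been connected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkState:
    connected: bool
    full_duplex: bool
    speed_mbps: int


def default_link_alerts(previous: Optional[LinkState], current: LinkState) -> list:
    """Return (parameter, severity) pairs raised by the default settings."""
    alerts = []
    if previous is None or not previous.connected:
        # No previously detected link: nothing to compare against, no alert.
        return alerts
    if not current.connected:
        alerts.append(("LinkStatus", "WARNING"))
        return alerts
    if previous.full_duplex and not current.full_duplex:
        alerts.append(("DuplexMode", "WARNING"))
    if current.speed_mbps < previous.speed_mbps:
        alerts.append(("LinkSpeed", "WARNING"))
    return alerts
```

For instance, an adapter that drops from 1000 Mbps full-duplex to 100 Mbps half-duplex yields both a DuplexMode and a LinkSpeed warning in this model.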

Disabling Missing Device Detection

Enabled by default, the missing device detection mechanism alerts operators when a device that was previously detected in the system is no longer found. If a device is no longer discovered, its Present parameter goes into alarm.

This mechanism is very useful when, for example, a non-redundant physical disk does not restart during a system reboot and is therefore no longer seen by the operating system and the monitoring software.

The missing device detection feature does not apply to logical disks, voltage and temperature sensors, LEDs, and LUNs.

To disable the missing device detection:

  1. Right-click the Hardware icon > KM Commands > KM Settings > Missing Device Detection

    Missing Device Detection

  2. Disable missing device detection: Select this option if you no longer want to be alerted when a device goes missing. Deactivating this feature when some missing devices have already been detected will trigger the removal of these devices from the monitoring environment.

  3. Remove missing devices: If the Disable Missing Device Detection option is unchecked, specify when the missing devices should be removed:

    • Never: devices will continue to appear in the console even when they are detected as missing.
    • Immediately: devices will be removed from the console as soon as they are detected as missing. An alert will however be triggered before deletion.
    • After this number of hours…: devices will be removed when the specified time is reached. Enter the desired time in the text field.

    The history of devices that have been removed will be lost.

  4. Clean-up missing devices on start up: Select this option if Hardware Sentry triggers false “Missing Device” alerts after a start up (reboot) operation. Occasionally, servers reassign IDs to some devices during start up (reboot). When this situation occurs, Hardware Sentry cannot match these devices with their new IDs and irrelevant alerts are triggered. This option configures Hardware Sentry to automatically clean-up the list of previously discovered devices when the Agent starts.

    Devices that went missing during the start up (reboot) operation will not be reported as missing.

  5. Click OK.
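
The three removal options in step 3 amount to a simple decision rule, sketched below. The function and policy names are hypothetical labels for the Never, Immediately, and After this number of hours… options. The sketch ignores the alert that is triggered before deletion.

```python
def should_remove(policy, hours_missing, after_hours=None):
    """Decide whether a missing device is removed from the console."""
    if policy == "never":
        return False
    if policy == "immediately":
        return True
    if policy == "after_hours":
        if after_hours is None:
            raise ValueError("after_hours is required for this policy")
        return hours_missing >= after_hours
    raise ValueError(f"unknown policy: {policy}")


print(should_remove("after_hours", hours_missing=30, after_hours=24))  # True
```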

Managing Unknown Status

On rare occasions, Hardware Sentry finds an unexpected value for a monitored device and returns the poll with “Unknown Status”. If this situation occurs, you can configure Hardware Sentry to perform specific actions.

To manage the Unknown Status:

  1. Right-click the Hardware icon > KM Commands > KM Settings > Additional Settings > Unknown Status Management

    Unknown Status Management

  2. Select an action to perform when the Unknown Status is returned:

    • Set the Status parameter to OK (default)
    • Ignore value and do not feed parameter
    • Trigger a WARNING on the Status parameter
    • Trigger an ALARM on the Status parameter
  3. Trigger an Internal KM Issue Notification: By default, Hardware Sentry triggers an Internal KM Issue Notification that displays an error message in the System Output Window of the PATROL console when an unknown status is collected. Un-select this option to disable the notification.

  4. Click OK.

Resetting ErrorCount Parameters

ErrorCount-type parameters keep increasing as new errors are encountered. When one error occurs, the ErrorCount-type parameter is increased to 1 (one), and retains this value until it is acknowledged. When another error occurs, the ErrorCount-type parameter is automatically increased to 2 (two), and again, it retains this value until it is acknowledged. These errors can be manually or automatically acknowledged.

Manually Acknowledging ErrorCount Alerts

To manually acknowledge ErrorCount alerts, right-click the device icon > KM Commands > KM Settings > Acknowledge ErrorCount Alerts and Reset. Performing this operation indicates that you are aware of the error.

Automatically Acknowledging ErrorCount Alerts

Since the manual acknowledgment of each error can rapidly become time consuming, Hardware Sentry can be configured to automatically acknowledge an alert on ErrorCount parameters and reset them to zero after a specified period of time.

  1. Right-click the Hardware icon > KM Commands > KM Settings > Automatic ErrorCount Reset

    Automatic ErrorCount Reset

  2. Select the period of time after which you want the KM to automatically reset the ErrorCount parameters:

    • Never (manual reset)
    • After 1 minute
    • After 1 hour
    • After 6 hours
    • After 24 hours
  3. Click OK.

In many cases, a hardware error that does not reoccur after a certain amount of time can be safely ignored. Automatically resetting ErrorCount parameters to zero after a few hours is often considered good practice.
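
The interplay between error counting, manual acknowledgment, and automatic reset can be modeled as follows. This is a sketch with hypothetical names. It assumes that the reset delay is measured from the most recent error, which the documentation does not specify.

```python
class ErrorCount:
    """Toy model of an ErrorCount-type parameter."""

    def __init__(self, reset_after_s=None):
        # None models the "Never (manual reset)" option.
        self.reset_after_s = reset_after_s
        self.value = 0
        self.last_error_at = None

    def record_error(self, now_s):
        """Count a new error observed at time now_s (in seconds)."""
        self._maybe_reset(now_s)
        self.value += 1
        self.last_error_at = now_s

    def acknowledge(self):
        """Manual acknowledgment: reset the counter to zero."""
        self.value = 0
        self.last_error_at = None

    def read(self, now_s):
        """Return the current value, applying the automatic reset if due."""
        self._maybe_reset(now_s)
        return self.value

    def _maybe_reset(self, now_s):
        if (self.reset_after_s is not None
                and self.last_error_at is not None
                and now_s - self.last_error_at >= self.reset_after_s):
            self.acknowledge()
```

With `ErrorCount(reset_after_s=6 * 3600)`, two errors recorded a minute apart read as 2 until six hours pass without a new error, after which the value returns to 0.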

Acknowledging LinkStatus Alerts

The Acknowledging LinkStatus Alert KM command allows you to acknowledge and clear an alert triggered on the LinkStatus parameter when a loss of a network link is detected.

To acknowledge LinkStatus alerts:

  1. Right-click a Network Interface instance.
  2. Select KM Commands > Acknowledging LinkStatus Alert.
  3. The LinkStatus parameter is no longer in alert.

By default, Hardware Sentry does not trigger another alert until the network interface is plugged in and unplugged again.

Preventing False Alerts

Hardware Sentry offers a global advanced alert setting to prevent false alerts. You can set the number of times a threshold can be breached before an alert is triggered on the following parameters: numeric, discrete, connector status and present parameters.

To prevent false alerts:

  1. Right-click the main Hardware icon > KM Commands > Alert after N Times…

    Alerts After n Times Parameters

  2. For each of the listed parameters, indicate the number of consecutive times a parameter has to stay above (or below) the threshold for an alert to be triggered.

  3. Check the Update existing thresholds with new values option to update all existing Hardware Sentry thresholds with new values. Leave this option un-checked to keep the current Alert After N Times values; however, note that new systems will use the newly specified N Times values.

  4. Click OK.

This setting applies to the parameters of all the monitored devices of the system.
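
The "alert after N times" logic can be illustrated with a counter of consecutive breaches. The sketch below is a generic model with hypothetical names, not the PATROL threshold implementation: a single non-breaching poll resets the counter, so only N breaches in a row raise an alert.

```python
class ConsecutiveBreachCounter:
    """Raise an alert only after n_times consecutive threshold breaches."""

    def __init__(self, n_times):
        if n_times < 1:
            raise ValueError("n_times must be at least 1")
        self.n_times = n_times
        self.consecutive = 0

    def update(self, breached: bool) -> bool:
        """Record one poll result and return True if an alert should fire."""
        self.consecutive = self.consecutive + 1 if breached else 0
        return self.consecutive >= self.n_times


counter = ConsecutiveBreachCounter(n_times=3)
polls = [True, True, False, True, True, True]
print([counter.update(breached) for breached in polls])
# [False, False, False, False, False, True]
```

A transient spike that lasts one or two polls is ignored in this example, while a condition that persists for three polls triggers the alert.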