Release Notes v1.9.50

•	Two new configuration variables, removeAllThresholds and trimFromDisplayName, are available to respectively delete all existing thresholds from the configuration and specify the characters to be removed from the object display names.
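
As a rough illustration of what trimFromDisplayName implies, the sketch below removes a configured set of characters from a display name. Only the variable name comes from these notes. The function name, its arguments, and the matching rule (every listed character is removed wherever it occurs) are assumptions, not the product's actual implementation.

```python
def trim_display_name(display_name: str, chars_to_trim: str) -> str:
    """Return display_name without any character listed in chars_to_trim.

    Hypothetical helper: the real trimFromDisplayName semantics may differ.
    """
    # str.maketrans with a third argument maps each listed character to None,
    # so translate() deletes it wherever it appears in the name.
    return display_name.translate(str.maketrans("", "", chars_to_trim))


print(trim_display_name("Disk #1 (SAS)", "#()"))
```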

Supported Platforms

New Supported Platforms

•	Hitachi (HDS) AMS, HUS Storage Systems (monitored through the Hitachi Storage Navigator Modular 2 CLI).

•	Oracle/Sun servers (monitored through the Oracle Hardware Management Agent, which is the recommended method).

Improved Platforms

•	Non-port components for Cisco Ethernet Switches (power supplies, fans, temperature sensors, and voltages) are supported. They are monitored via SNMP.

•	TrueSight OM - Hardware reports the status of the System Attention LED and triggers an alert if a hardware problem has been reported on a system since midnight.

•	Logical disks (hdisks), physical disks (pdisks) and batteries managed by sissasraidmgr are supported.

•	SPARC Enterprise Mx000 (XSCF): Negative voltage sensors, such as -12V, are discovered.

•	Individual memory sensors listed under "Other" sensors in the vCenter/vSphere configuration tab are supported.

Monitored Components

•	The Power State attribute is available to indicate whether the blade is currently on or off for Dell Blade Servers, Hitachi BladeSymphony Chassis, HP BladeSystem rack, and IBM BladeCenter chassis.

Changes and Improvements

Functionality

•	Improved stability: The product better handles error conditions on a large number of hosts.

•	On Windows systems, the monitoring solution could take a very long time to initialize at startup and after a reinitialization.

•	When the PATROL Agent is managed from CMA, the default thresholds mechanism is now set to "tuning" to automatically generate alarms/events.

•	TrueSight OM - Hardware can be configured to automatically delete missing instances after a certain time.

•	Additional information (BIOS Version, Driver Version, Manufacturer, etc.) is provided in the Hardware Health Report and in events triggered by the Alert Actions.

•	Specific PATROL Events triggered upon Hardware and Connector failures also indicate the alert origin (monitor type name).

•	By default, TrueSight OM - Hardware will no longer trigger an event when an internal issue occurs.

•	The debug file and the Hardware Inventory now report the total number of instances of every monitor type for each host.

•	The maximum number of concurrent collection threads is now set per host through the maxConcurrentCollectThreadsPerHost global configuration variable. The maxConcurrentCollectThreads variable is therefore no longer supported.
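
The per-host limit described for maxConcurrentCollectThreadsPerHost can be pictured with one semaphore per host. This is a minimal sketch under stated assumptions: the PerHostLimiter class, its methods, and the use of Python threads are hypothetical and only illustrate the idea of capping concurrent collections per host rather than globally.

```python
import threading


class PerHostLimiter:
    """Cap the number of collections running at once for each host (hypothetical)."""

    def __init__(self, max_threads_per_host: int):
        self._max = max_threads_per_host
        self._lock = threading.Lock()
        self._semaphores = {}

    def _semaphore_for(self, host: str) -> threading.BoundedSemaphore:
        # Create the host's semaphore on first use; the lock avoids creating
        # two semaphores for the same host from concurrent threads.
        with self._lock:
            if host not in self._semaphores:
                self._semaphores[host] = threading.BoundedSemaphore(self._max)
            return self._semaphores[host]

    def run(self, host: str, collect):
        # Block until one of the host's slots is free, then run the collection.
        with self._semaphore_for(host):
            return collect()
```

With this design, a slow host can exhaust only its own slots, so collections on other hosts are not delayed by it.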

Supported Platforms

•	Fujitsu Eternus: TrueSight OM - Hardware provides more detailed information about Fujitsu Eternus systems and uses significantly fewer resources.

▪	TrueSight OM - Hardware collects a real time power consumption value from either the VMware ESX CIM agent or HP Insight Management Agent for VMware ESX.

▪	All temperature thresholds of zero have been removed to avoid unwanted temperature alerts.

•	When HP servers did not return an overall power consumption value, the monitoring solution disabled the corresponding attribute and made an estimate in the capacity report. When this information is not available, the monitoring solution will now try to sum up the power consumption of all power supplies before falling back on the estimate.

•	IBM BladeCenter Chassis: Embedded switches, passthroughs, and management modules are now monitored.

▪	Additional information is available for LUNs (WWN, Array Name, Hardware Location Code, and Expected Number of Paths).

▪	The Status of the System Attention LED is now reported and an alert is triggered when this LED is turned on.

•	On IBM AIX and VIO servers, the system is now fully identified with its hardware ID, LPAR ID, system ID and model name. More details have also been added to several components (disks, network cards, FC ports, and CPUs) to facilitate their identification in case of a failure.

•	IBM VIO Server systems are better identified with their model name, code, IDs, etc.

▪	Additional information is available for physical disks (vendor, model, serial number, firmware, etc.).

•	Quantum Scalar i2000 and i6000: The components' visible identifiers are now based on sensor names, locations, etc.

•	SUN SPARC Enterprise Mx000 (XSCF): False voltage alerts were triggered due to incorrect thresholds.

•	SUN SPARC Servers (Prtpicl): A smaller version of the device ID is used for the display ID to enable easier sensor identification.

•	VMware ESX: The monitoring of power supplies has been improved. Both VMware ESX health and availability status are used to determine the health of the power supply.

•	Disk Monitoring on VMware ESX servers (IPMI): Because some servers use the same IPMI Monitored Device ID for all physical disks, the monitoring solution uses the IPMI Device ID to group sensors for each physical disk. The physical disk's caption is now used as the Display ID.
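
The power consumption fallback described in this list for HP servers (overall value first, then the sum of the power supplies, then the estimate) can be sketched as follows. The function name, its parameters, and the rule that every power supply must report a reading before the sum is used are assumptions made for illustration; the notes do not describe how partial readings are handled.

```python
from typing import Optional, Sequence, Tuple


def power_consumption_watts(
    overall: Optional[float],
    power_supplies: Sequence[Optional[float]],
    estimate: float,
) -> Tuple[float, str]:
    """Return (watts, source) following the fallback order in the notes."""
    # 1. Prefer the overall value reported by the server.
    if overall is not None:
        return overall, "overall"
    # 2. Otherwise sum the power supplies. Assumption: the sum is only
    #    trusted when every power supply reported a reading.
    if power_supplies and all(w is not None for w in power_supplies):
        return sum(power_supplies), "sum of power supplies"
    # 3. Last resort: the estimate used in the capacity report.
    return estimate, "estimate"


print(power_consumption_watts(None, [210.0, 195.0], 500.0))
```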

Monitored Components

•	Logical Disks, Temperature, Voltage and LED instances will be automatically deleted in the BMC TrueSight OM Console as soon as they are detected as "missing".

•	A more reliable method is being used to associate batteries to their related disk controllers.

•	Hardware Sentry now uses Counter64 OIDs for Ethernet switches equipped with a MIB-2 standard SNMP Agent.

•	HBA Cards Monitoring on all Windows-based systems: The LUN's naa.ID is now used to identify LUNs. Because a LUN and the corresponding storage system volume share this unique identifier, the naa.ID helps link LUNs to their volumes. The disk's Windows MPIO ID, as well as the drive letters and partition names of any volumes on that LUN, are now also provided. A typical LUN ID will therefore read: naa.60616043312F05A4308DC65F111 (MPIO Disk0 - C:(OS) D:(Data))

▪	A warning is triggered when the number of available paths is one path lower than the initial number of available paths.

▪	The problem type, consequences, and recommended actions are provided when an alert is triggered on the Status and Available Path Count attributes.

▪	Windows MPIO LUNs Monitoring: Because Windows regularly changes the unique identifiers of LUNs and physical disks, false missing/present alerts could occur for LUNs and duplicate instances could appear. To solve this issue, the monitoring solution now uses the LUN's naa.ID, which is unique and does not change.

▪	Default thresholds are set on the Error Percent attribute of the Hardware Network Interface monitor type (≥ 10% = warning, ≥ 30% = alarm).
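
The default Error Percent thresholds stated above (≥ 10% = warning, ≥ 30% = alarm) amount to a simple classification. In this sketch the two threshold values come from these notes, while the function name and the "ok" label for values below 10% are assumptions.

```python
WARNING_THRESHOLD = 10.0  # percent, from the release notes
ALARM_THRESHOLD = 30.0    # percent, from the release notes


def error_percent_severity(error_percent: float) -> str:
    """Map an Error Percent value to a severity label (labels are illustrative)."""
    # Check the higher threshold first so that 30% and above is never
    # reported as a mere warning.
    if error_percent >= ALARM_THRESHOLD:
        return "alarm"
    if error_percent >= WARNING_THRESHOLD:
        return "warning"
    return "ok"


for value in (2.5, 10.0, 29.9, 30.0):
    print(value, error_percent_severity(value))
```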

Fixed Issues

Functionality

•	The product could freeze or stop working in case of repeated discovery timeouts, when reinitializing the KM in large environments, or when too many connectors failed at the same time.

•	In some situations, TrueSight OM - Hardware would not activate the Bandwidth Utilization attribute even if it could collect the network’s bandwidth utilization.

•	When using a version of the PATROL Agent older than 9.0, the Monitor Type for the Hardware LUNs was missing.

Supported Platforms

•	Data Domain Storage Systems: Due to the structure of the Data Domain MIB, specific strings in a Physical Hard Drive's serial number could cause a disk to report an unknown status.

•	Dell PowerEdge Servers: the physical disk instances were not attached to the proper disk controller instance.

•	Dell TL2000/4000 and IBM TS3100/3200 Tape Libraries: Tape drive mounts were incorrectly reported as errors, which caused false alerts to be triggered on the Error Count attributes.

•	EMC Isilon Systems monitoring: Timestamped log files would fill up the file system. To solve this issue, these log files are now sent to /tmp/MS_HW_isi_hw_check without a timestamp.

▪	The execution of all commands is fully serialized to prevent conflicts. All temporary files used by the batch files/shell scripts use randomized file names to prevent file locks and missing files.

▪	Logical Disk Status was not collected for some systems when the command output format was not supported by the monitoring solution.

▪	Disk controllers and their batteries are now properly discovered even when no information on their model or serial number is available.

▪	Thresholds labeled as "Critical" were often only “Warning” temperatures in HP's Insight Manager Agent. The monitoring solution now detects this problem and sets the right thresholds.

•	HP Servers Running Windows: Disk controllers and their batteries are now properly discovered even when no information on their model or serial number is available.

•	HP-UX System: In some cases, the value of the Error Percent attribute of the Network monitor type was not reported correctly.

▪	Network statistics were not collected for physical ports that were part of SEA Virtual Adapters.

▪	Network delays were observed when the entstat command, which is used to collect Ethernet port statistics on IBM AIX servers, was run on disabled ports.

▪	HBA ports were only considered active if a tape drive or hard disk was attached to them. HBA ports will now be considered active if an enabled path is associated to them.

▪	Ports that were used as failover ports by MPIO were considered disabled. This caused false link down alerts and stopped the monitoring of ports that were in fact active.

•	IBM Storwize (SSH): LEDs were not reporting all faults on both v3700 and v7000 systems.

•	IBM x Series Servers: On rare occasions, duplicate processor instances could appear in your monitoring environment because the IBM Director Agent reported each processor twice.

•	The SEL Fullness sensor is now excluded to avoid getting SEL Fullness alerts when monitoring an IBM server using IPMI.

•	The BIST_FAIL sensor is now excluded to avoid getting false CPU alerts when monitoring a Cisco UCS Blade.

▪	Disks branded as Sun and larger than 1TB were excluded from the discovery because the expected product tag was "SUNxxxG".

▪	Due to a recent modification in the psrinfo command output, cores were reported as full processors. They are now grouped under a single physical CPU.

▪	(Prtpicl): No thresholds appeared for fan sensors when LowWarningThreshold did not exist for fan instances. LowPowerOffThreshold will now be used whenever this situation occurs.

▪	Sun SPARC servers (Running Solaris): Invalid values were reported or false alarms were triggered for temperature and voltage sensors.

▪	Authentication failures for some ESXi servers could occur when monitoring VMware ESXi servers using vCenter as a multi-tier authentication server.

▪	The port status for link down ports had been modified in VMware ESX 5.5, which caused the VMware ESXi 4.x connector to falsely report port failures.
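
The fan threshold fallback described above for Prtpicl (use LowPowerOffThreshold when LowWarningThreshold does not exist) can be sketched as a lookup with a preference order. The two property names come from these notes; representing a fan sensor as a dictionary and the function name are assumptions made for illustration.

```python
from typing import Optional, Tuple

# Property names from the notes, in order of preference.
LOW_THRESHOLD_KEYS = ("LowWarningThreshold", "LowPowerOffThreshold")


def fan_low_threshold(sensor: dict) -> Optional[Tuple[str, float]]:
    """Return (property name, value) of the first low threshold that exists."""
    for key in LOW_THRESHOLD_KEYS:
        value = sensor.get(key)
        if value is not None:
            return key, value
    # Neither property exists: no low threshold can be set for this fan.
    return None


print(fan_low_threshold({"LowPowerOffThreshold": 1200}))
```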

Monitored Components

•	Devices classified as “Other Devices” (CP Modules, etc.) are now attached to their respective enclosures.

•	Emulex HBA monitoring failed when hbacmd was not installed in /usr/sbin/hbanyware/hbacmd. The monitoring solution will now run the command without the full path. Please note that this modification requires hbacmd to be added to the PATH environment variable of the user used to monitor the server.

•	The solution monitors LSI sas2ircu-Managed RAID Controllers even when the manufacturer's agent does not report their status or is not installed.

•	Multiple instances of the same LSI RAID Controller could appear in the monitoring environment.

▪	TrueSight OM - Hardware rounded the size of logical disks bigger than 1 TB to a whole number (e.g., the size of a 1.4 TB logical disk was displayed as 1 TB). The size of disks bigger than 1 TB is now rounded to one decimal place.

▪	A thread/handle leak could occur in the VDS.EXE process (Virtual Disk Service) when monitoring logical disks in a Microsoft Windows system and could cause the corresponding service to crash.

▪	When servers (typically HP ProLiant) do not report sizes of physical disks, the monitoring solution queries the associated storage extents to find the actual disk size.

▪	The monitoring solution failed to interpret the status of non-RAID disks (reported as Unknown instead of OK).

•	Redundant fans sometimes reported a speed/speed percent reading of zero, which triggered an alert even if no thresholds were set. The monitoring solution now disables the speed/speed percent attributes if a valid status is collected to avoid this issue while maintaining full monitoring.

•	Windows MPIO LUNs Monitoring: Because Windows regularly changes the unique identifiers of LUNs and physical disks, false missing/present alerts could occur for LUNs and duplicate instances could appear.

▪	If a Windows server had both LUNs and local non-RAID physical disks, then TrueSight OM - Hardware monitored both as local physical disks. LUNs will now be excluded from the monitoring.

•	Because the monitoring solution was unable to report problems on logical disks in Windows environments, logical volumes are no longer displayed for non-English versions of Windows.

•	Some physical disks were missing when monitoring Linux / Solaris servers with Adaptec StorMan managed RAID cards.
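
The logical disk size fix mentioned in this list (sizes bigger than 1 TB rounded to one decimal place) can be illustrated with a small formatting function. This is only a sketch: the function name, the input in bytes, the binary definition of TB and GB, and the whole-GB display for smaller disks are assumptions that the notes do not confirm.

```python
GB = 1024 ** 3  # assumption: binary units
TB = 1024 ** 4


def format_logical_disk_size(size_bytes: int) -> str:
    """Format a logical disk size; only the above-1-TB rule comes from the notes."""
    if size_bytes > TB:
        # One decimal place, so a 1.4 TB disk is no longer shown as 1 TB.
        return f"{size_bytes / TB:.1f} TB"
    # Assumption: smaller disks are shown in whole gigabytes.
    return f"{round(size_bytes / GB)} GB"


print(format_logical_disk_size(int(1.4 * TB)))
```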

BMC TrueSight Operations Management - Hardware
