Troubleshooting EMC NetWorker KM
This section describes the basic troubleshooting steps to follow before contacting Customer Support and lists the most common issues.
First Troubleshooting Steps
When you encounter a problem when installing or running EMC NetWorker KM:
- Look for error messages in the PATROL Console System Output Window (SOW) or in the log file NSR_<port>.log. Most error messages are self-explanatory.
- Run the KM Status Report by selecting the menu KM Status from the Server instance or the NetWorker Setup icon. This report lists most KM problems.
- For the most severe problems, look for PEM events. They include Expert Advice, which provides details about the problem and suggestions for resolving it.
Most Common Issues
KM Behavior is Unchanged After Upgrade
Check the version of the KM from the main Infobox. If it has not changed, the installation is not complete. Make sure that both the PATROL Console and the PATROL Agent are uninstalled and installed correctly during the KM upgrade.
EMC NetWorker Icon Missing After Loading
- Check that the KM is loaded using NSR_LOAD.kml and that NSR_MAIN is loaded.
- Check that there is no KM version mismatch between the PATROL Console and the PATROL Agent. Check the messages in the SOW to verify this.
- Check whether the PATROL Agent tuning variable /AgentSetup/AgentTuning/pslInstructionMax has been increased as suggested in the Changes Required section. Check the messages in the SOW.
- Check whether the PATROL Agent user has necessary privileges added in the Agent’s Access Control List (/AgentSetup/accessControlList) in order to read and write to the Agent Configuration Database.
Unable to Find NSR_LOAD.kml
- Check that the Load KM browser is looking for *.kml files under the PATROL_HOME/lib/knowledge folder.
- Verify that the EMC NetWorker KM files have been installed correctly under the PATROL installation directory on the PATROL Console.
KM Application Instances Do Not Appear
- Check that the KM instance limits have not been exceeded. Look for error messages in the SOW, and increase the instance limits for affected objects using the menu Configuration > Instance Limits….
- Check whether the server icon is in offline state. None of the data collectors will be executed until the server is enabled and online.
- If the KM is configured for multi-node mode monitoring, some components are not monitored on the passive cluster node.
KM Configuration Menus are Disabled
EMC NetWorker KM can be configured either from a BMC PATROL Console (Classic Mode) or from BMC TrueSight Operations Management. When the KM is installed on a PATROL Agent that is managed by Central Monitoring Administration (CMA), all the KM configuration menus are disabled in the PATROL Console. To configure EMC NetWorker KM from a PATROL Console, you need to force the KM to run in Classic Mode.
When set to run in Classic Mode, EMC NetWorker KM stops receiving configuration from CMA. Any monitoring set in CMA and used by the PATROL Agent is removed and replaced by the configuration made from the PATROL Console. Although policies created in CMA are not deleted, any configuration set in Central Monitoring Administration will be ignored.
To force the KM to run in Classic Mode:
In the PATROL Console, right-click the EMC NetWorker icon > KM commands > Configuration > Force Classic Configuration Mode…
Check Force the KM to run in Classic mode and click OK.
The EMC NetWorker KM will then start running in Classic Mode, enabling you to use the KM Configuration menus.
To configure the KM in TrueSight OM, follow the above procedure and uncheck Force the KM to run in Classic mode. All configurations made through the PATROL Console will then be ignored.
KM Objects Disappear from the Console
- Check whether there is a KM version mismatch between the PATROL Console and the PATROL Agent, possibly after an improper upgrade of the KM. Check the messages in the SOW to verify this.
- Check that the EMC NetWorker KM login details are still valid. Has the password changed on the system? Look for error messages in the SOW, and check for additional information in the last annotation point for parameter NSRLoginStatus.
Old Active Save Groups are not Removed
By default, all active save groups are monitored, and they are exempted from aging. It is possible to change this behavior by unchecking the Keep Monitoring Active Save Groups Indefinitely box in the Save Groups Configuration window accessible from the Save Groups instance Configuration > Save Groups.
Old Acknowledged Save Groups Kept in pconfig
By default, the KM stores all acknowledged save groups. Use the following PSL through PATROL Console to keep only the last <Number> of jobs on <node-id>:
%PSL pconfig("REPLACE", "/NSR/<node-id>/NSR_JOB/jobAcknowledgementCapacity",<Number>);
Replace <node-id> with the appropriate node ID of the NetWorker server.
CPU and Memory Usage is too High
CPU and memory usage depend on the size and complexity of your environment and on your EMC NetWorker KM configuration. As you increase the data collection frequency or the number of servers and components monitored by the KM, CPU and memory usage will increase.
EMC NetWorker KM for PATROL may therefore require some fine tuning to optimize the available resources. Consider the following options:
- disabling the monitoring of unnecessary component instances
- disabling unwanted components by setting their instance limits to 0 (zero)
- disabling unwanted collectors by using the PATROL Configuration Manager
- increasing the collector scheduling interval by using the PATROL Configuration Manager
- decreasing the instance limits to limit the number of instances created by the collectors
The data collectors in EMC NetWorker KM use the EMC NetWorker command line interface to obtain EMC NetWorker information. Most of the performance degradation is associated with these command executions and the amount of data they return. Following regular housekeeping on all EMC NetWorker systems may improve overall performance.
When monitoring a NetWorker server through a local PATROL Agent, the EMC NetWorker KM generates minimal network traffic. Most of the data is kept on the managed node. The amount of network traffic that it generates depends on the number of PATROL Consoles that are connected to the PATROL Agent and the frequency of data collection.
When monitoring remote EMC NetWorker servers, some network traffic will be observed because the command results are transferred over the network. The traffic depends on the amount of data polled during each command execution. When commands are expected to return large output, the KM is designed to use file transfers through SFTP (on UNIX/Linux) and Windows file shares (on Windows).
Parameters and Application Classes Refresh Takes too Long
Data collectors run according to their scheduling interval (polling cycle) defined in the KM. These intervals are defined for a standard environment with minimal resource impact. Intervals can be customized from the PATROL Developer Console or PCM to suit your environment requirements. Refer to the PATROL Console User Guide for more details.
Poor Performance of the Server/Storage Node
The performance of the Server/Storage Node may change after installing the EMC NetWorker KM on a heavily used system. Depending on the complexity of your EMC NetWorker environment, the KM may consume more resources to interrogate the application and process the data. In such a complex environment, the EMC NetWorker KM may require some fine tuning to optimize the available resources. Consider the following options:
- Disable the monitoring of unnecessary application instances. Refer to the section Filtering Elements to Monitor for more details
- Increase the scheduling interval (polling cycle) for data collectors
- Deactivate monitoring non-critical components by setting the Instance Limits to 0 (zero)
- Deactivate unnecessary data collectors during selected time intervals when there is no EMC NetWorker activity. For example, if save group monitoring can be disabled between 9 am and 4 pm every day except weekends, disable the save group data collector (NSRSaveGrpCollector) during this period, using the following PSL through PATROL Console:
Here the pconfig variable is named <collector name>Mode. Replace <node-id> with the appropriate node ID of the EMC NetWorker server. The value contains the following details, delimited by a pipe (|): enabled (1) or disabled (0) data collection, default start and end times in seconds since midnight, and start and end times for non-default days, from Sunday through Saturday.
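The start and end times in this value are expressed in seconds since midnight. The following Python snippet is only an illustrative sketch of that arithmetic for the 9 am to 4 pm example; it is not part of the KM, and the helper name seconds_since_midnight is hypothetical.

```python
def seconds_since_midnight(hhmm: str) -> int:
    """Convert a 24-hour "HH:MM" time to a number of seconds since midnight."""
    hours, minutes = (int(part) for part in hhmm.split(":"))
    if not (0 <= hours <= 23 and 0 <= minutes <= 59):
        raise ValueError(f"invalid time: {hhmm!r}")
    return hours * 3600 + minutes * 60


# Hypothetical helper applied to the example period from the text.
start = seconds_since_midnight("09:00")  # 9 am
end = seconds_since_midnight("16:00")    # 4 pm
print(start, end)  # 32400 57600
```

With this conversion, 9 am corresponds to 32400 and 4 pm to 57600.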
The JOB_TEXT command, which sets the display-only text parameter NSRSaveGrpText, can be disabled to improve performance using the PSL below. Replace <node-id> with the appropriate node ID of the EMC NetWorker server and restart the PATROL Agent:
%PSL pconfig("REPLACE", "/Runtime/NSR/<node-id>/NSR_JOB/jobCollectText", 0);
As part of collection, the collector compares each job against the previous similar backup to calculate the progress data. In addition, the last backup information is shared under NSR_POLICY parameters (NSRPolicy*Backup*) to monitor success at the policy level. This functionality can also be disabled to speed up the collector, using the PSL below. Replace <node-id> with the appropriate node ID of the EMC NetWorker server and restart the PATROL Agent:
%PSL pconfig("REPLACE", "/Runtime/NSR/<node-id>/NSR_JOB/jobCollectLastBackupDetails", 0);
Defining a “no command execution window” for all collectors will pause running commands at peak times or during EMC NetWorker maintenance windows. This can be set using the PSL below. Replace <node-id> with the appropriate node ID of the EMC NetWorker server and restart the PATROL Agent:
%PSL pconfig("REPLACE", "/Runtime/NSR/<node-id>/noExecuteWindow","23:59:00|120");
The value of this permanent configuration variable is in the format <start time in HH:MM:SS 24-hour clock>|<duration in seconds>. The above value, 23:59:00|120, makes all collectors sleep between 23:59:00 and 00:01:00 (2 minutes) every day before executing commands. noExecuteWindow also supports multiple time windows:
%PSL pconfig("REPLACE", "/Runtime/NSR/<node-id>/noExecuteWindow",["23:59:00|120","11:59:00|120"]);
Replace <node-id> with the appropriate node ID.
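To make the <start time>|<duration in seconds> format more concrete, here is a minimal Python sketch of how such a value could be interpreted, including a window that crosses midnight. It is an illustration only, not the KM's implementation: the function names are hypothetical, and treating the end of the window as exclusive is an assumption.

```python
from datetime import datetime, time, timedelta


def parse_window(value: str) -> tuple[time, int]:
    """Split "HH:MM:SS|<seconds>" into a start time and a duration in seconds."""
    start_text, duration_text = value.split("|")
    start = datetime.strptime(start_text, "%H:%M:%S").time()
    return start, int(duration_text)


def in_window(value: str, now: datetime) -> bool:
    """Return True if `now` falls inside the daily window described by `value`.

    Assumption: the window end is exclusive.
    """
    start, duration = parse_window(value)
    # Check the window opened today and the one opened yesterday,
    # so that windows crossing midnight are handled.
    for day_offset in (0, -1):
        opened = datetime.combine(now.date() + timedelta(days=day_offset), start)
        if opened <= now < opened + timedelta(seconds=duration):
            return True
    return False


# 00:00:30 is inside the window that opened at 23:59:00 the previous day.
print(in_window("23:59:00|120", datetime(2024, 1, 2, 0, 0, 30)))
```

Several windows can be checked the same way, for example with any(in_window(w, now) for w in windows).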
- Purge unnecessary information in EMC NetWorker catalog databases and log files.
- If there are too many clients configured in EMC NetWorker, the NSRClientCollector and NSRGroupCollector may affect the overall performance. In such an environment, disable the NSRClientCollector or set the relevant instance limits to 0 (zero) using the menu Configuration > Instance Limits.
- Refer to the Infinite Loop Errors section below for a possible PATROL internal scheduling delay which may impact the performance of the KM.
Infinite Loop Errors
If error messages in the SOW report that some EMC NetWorker KM data collectors may be in an infinite loop, check the setting of the tuning variable /AgentSetup/AgentTuning/pslInstructionMax.
The PATROL Agent uses the pre-configured tuning variable (/AgentSetup/AgentTuning/pslInstructionMax) to stop running PSL functions in an infinite loop. When a PSL function reaches this maximum threshold, it reports this error, and puts the execution of this function to the back of the process queue. This will not only delay the data collector, it will also impact the performance of the system.
To resolve this situation, increase the maximum number of instructions to an optimum value, which depends on the complexity of your environment. In a standard EMC NetWorker environment, the default value of 500,000 should be increased to at least 5,000,000 to enable the EMC NetWorker KM data collectors to execute without impacting your system.
If this still does not resolve the problem, you can disable this functionality by setting the value of the tuning variable to 0 (zero).
Debug Mode cannot be Activated
If the Debug Mode cannot be activated by following the method described in the Configuring the Debug Mode section, you can turn the debug on by setting an appropriate PATROL Agent configuration variable with a timestamp value. This timestamp value determines when the debug should be turned off. For example, to turn on the debug for 60 minutes from now, run the following PSL through PATROL Console:
time()+3600);
Where <component> is either the server for debugging the server discovery or the name of the collector component (in lower case) followed by Collector, like daemonCollector for debugging the daemon collector. Replace <node-id> with the node ID of the NetWorker server.
This parameter will show a “suspicious” state if any command executed by the EMC NetWorker KM fails.
- Check the annotation point on the first state change data point of this parameter to look for failing commands. If an annotation point cannot be found, or if it is not up-to-date, check the KM Status Report, which can be viewed by selecting the menu KM Status from the server icon. These errors are produced from the EMC NetWorker commands executed by the EMC NetWorker KM.
- Check that the operating system user configured in the menu Configuration > Login can execute all EMC NetWorker commands and access the EMC NetWorker files.