Alert Messages and Actions
This section describes how to define thresholds and alert severity levels, choose notification delivery methods, and specify the troubleshooting or recovery actions to run when a problem is detected on a monitored technology.
Configuring Alerts
With the Parameters and Alerts properties, you can configure any numeric parameter of a Monitor to trigger alerts, send notifications, and take specific actions when certain conditions are met.
There are three alarm ranges, Alarm #1, Alarm #2 and Out-of-range, each with a minimum and a maximum value:
- Use the Alarm #1 and Alarm #2 options to define the range of parameter values that triggers warnings and alarms.
- Use the Out-of-range border conditions to be informed when the collected values are outside the norm (less than or greater than the defined range limits).
| Parameters and Alerts | Description |
|---|---|
| Alarm #1 | |
| Severity Level | To turn off the alert or set the severity level (INFO, WARN, ALARM). |
| Threshold | To set the parameter threshold boundary values and the number of times the threshold must be breached before the Monitor triggers a first-level alert (Immediately or x times in a row). |
| Occurrence | To define the number of consecutive times the parameter reports a value within the alarm #1 range before the alert is triggered. An alert is automatically triggered upon threshold breach if the occurrence is set to Immediately. |
| Alarm #2 | |
| Severity Level | To turn off the alert or set the severity level (INFO, WARN, ALARM). |
| Threshold | To set the parameter threshold boundary values and the number of times the threshold must be breached before the Monitor triggers a second-level alert (Immediately or x times in a row). |
| Occurrence | To define the number of consecutive times the parameter reports a value within the alarm #2 range before the alert is triggered. An alert is automatically triggered upon threshold breach if the occurrence is set to Immediately. |
| Out-of-Range | To specify the range of values within which the parameter is considered to operate normally. The border range must be larger than the Alarm #1 and Alarm #2 ranges combined. Values are described as less than and greater than, for example <0 or >100. Values falling inside this range DO NOT trigger any warning, alarm, or recovery action. |
| Severity Level | To turn off the alert or set the severity level (INFO, WARN, ALARM). |
| Threshold | To set the parameter threshold boundary values and the number of times the threshold must be breached before the Monitor triggers an alert (Immediately or x times in a row). |
| Occurrence | To define the number of consecutive times the parameter reports a value outside the border range before the alert is triggered. An alert is automatically triggered upon threshold breach if the occurrence is set to Immediately. |
| Alert and Acknowledgement Message | To customize the alert and acknowledgement notification messages. By default, the Monitor sends the message defined in Studio > Studio Settings. You can use Alert Messages Macros to create a message tailored to your needs. |
| Alert Action | To define the actions you want Monitoring Studio X to take when a problem is detected. You can refresh any Monitor of the same Template when a parameter triggers an alert. The Monitor you choose to execute as an Alert Action can help you identify and address issues before they become critical. You decide what to do when a problem occurs, and you can get feedback about the actions taken. For example, you can set up a chain of actions, where the Monitor of the parameter in alert triggers a troubleshooting Monitor, which triggers another repairing Monitor, etc. Tip: Prefix the Monitors you are most likely to use as Alert Actions (for example: AA_) to rapidly locate them in the list of Monitors. It is recommended to set the Collect Schedule of the Alert Action Monitor to Run Manually to ensure this Alert Action is executed only when a problem is detected. |
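
The Occurrence setting described above can be pictured as a counter of consecutive collects that fall inside an alarm range. The Python sketch below is only an illustration: `OccurrenceCounter` and its fields are hypothetical names, not part of Monitoring Studio X. It assumes inclusive range bounds and models "Immediately" as an occurrence of 1.

```python
from dataclasses import dataclass


@dataclass
class OccurrenceCounter:
    """Hypothetical model of 'x times in a row' for one alarm range."""
    minimum: float
    maximum: float
    occurrence: int = 1  # 1 stands for "Immediately" (assumption)
    streak: int = 0      # consecutive collects inside the range so far

    def update(self, value: float) -> bool:
        """Record one collected value; return True when the alert should be raised."""
        if self.minimum <= value <= self.maximum:
            self.streak += 1
        else:
            self.streak = 0  # a value outside the range resets the count
        return self.streak >= self.occurrence


# Alert only after three consecutive values within 90..100
counter = OccurrenceCounter(minimum=90, maximum=100, occurrence=3)
results = [counter.update(v) for v in (95, 97, 80, 92, 93, 99)]
```

In this run, the value 80 interrupts the first streak, so only the last collect (the third consecutive value in range) raises the alert.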
Template Thresholds and Customized Thresholds
The alert thresholds defined in a Template are applied to all hosts using this Template, unless the parameter thresholds have been customized:
- through the Console view
- through the Agent > Agent Thresholds menu
- in a TrueSight CMA policy applied to this PATROL Agent
This way, you define default alert thresholds in the Template and operators and administrators can customize these thresholds on a specific instance.
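
The relationship between Template thresholds and customized thresholds can be thought of as a set of defaults with per-host overrides. The sketch below illustrates that idea only; the data layout, the parameter names, and the host name `server01` are assumptions made for the example and do not reflect how the PATROL Agent stores thresholds.

```python
from typing import Dict, Tuple

# Hypothetical records: parameter name -> (severity, minimum, maximum)
Thresholds = Dict[str, Tuple[str, float, float]]

# Defaults defined once in the Template
template_thresholds: Thresholds = {
    "ProcessorUtilization": ("ALARM", 90, 100),
    "MemoryUsage": ("WARN", 80, 100),
}

# Customizations made for one specific host (hypothetical host name)
customized: Dict[str, Thresholds] = {
    "server01": {"ProcessorUtilization": ("ALARM", 95, 100)},
}


def effective_thresholds(host: str) -> Thresholds:
    """Template defaults, overridden by any customization for this host."""
    return {**template_thresholds, **customized.get(host, {})}
```

A host without customization receives the Template values unchanged, while `server01` keeps the Template value for `MemoryUsage` and uses its own value for `ProcessorUtilization`.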
Overlapping Alarm Ranges and Precedence
Alert ranges can overlap, in which case the value of a parameter may fall within more than one range at the same time. The status of the parameter is then determined by the first matching range, in the following order:
- Out-of-Range (Border)
- Alarm #1
- Alarm #2
It is therefore important to make sure the most critical range (“ALARM”) takes precedence over a less critical range (“WARN”).
Example: If you want a “WARN” event to be triggered when Processor Utilization is greater than 75% and an “ALARM” event when it is greater than 90%, set the alert thresholds as follows:
- Alarm #1: ALARM if Processor Utilization > 90
- Alarm #2: WARN if Processor Utilization > 75
As an alternative, you can set the WARN range between 75 and 90, in which case there will be no overlap. But with this configuration, a new WARN event is triggered when the parameter value goes from 93% (ALARM) to 81% (WARN). Both options are valid and have their pros and cons.
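
The precedence rule can be sketched as a first-match lookup over an ordered list of ranges. This is a simplified Python illustration, not the product's implementation: `parameter_status` is a hypothetical helper, the Out-of-Range check (which comes first in the real order) is omitted for brevity, and the "greater than" thresholds are modeled as an exclusive lower bound with an upper bound of 100.

```python
from typing import List, Tuple

# Each range: (name, severity, minimum, maximum); list order is the precedence order
Range = Tuple[str, str, float, float]


def parameter_status(value: float, ranges: List[Range]) -> str:
    """Return the severity of the first range containing the value, else OK."""
    for _name, severity, minimum, maximum in ranges:
        if minimum < value <= maximum:
            return severity
    return "OK"


# Overlapping configuration from the example: ALARM is checked before WARN.
overlapping = [
    ("Alarm #1", "ALARM", 90, 100),
    ("Alarm #2", "WARN", 75, 100),
]

# Same ranges with the severities swapped: WARN would mask ALARM.
swapped = [
    ("Alarm #1", "WARN", 75, 100),
    ("Alarm #2", "ALARM", 90, 100),
]
```

With `overlapping`, a value of 93 resolves to ALARM and 81 to WARN. With `swapped`, 93 resolves to WARN because the less critical range is matched first, which is the situation the precedence rule is meant to avoid.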
Alert and Acknowledgement Messages
You can define the way Monitoring Studio X notifies you when alert conditions are detected on a monitored parameter or when its status is back to normal. Alert and acknowledgement notification delivery options and default messages can be configured from Studio > Studio Settings and apply to all the Monitors running on the Agent.
| Alert Delivery Options | Description |
|---|---|
| Annotation | To display a message at the annotation point of the parameter graph. |
| PATROL Event | To customize PATROL Event types and related messages. |
| Command Line | To execute a command line on the system where the PATROL Agent is installed. |
| Email | To send an email to one or multiple recipients. Email addresses must be separated by commas (,) or semicolons (;). |
| OS Command | To execute an OS command on the Agent. |
| PSL Script | To specify the PSL statement to be executed locally by the Agent. |
| Log File Entry | To add a user-defined entry to the Log file. |
| SNMP Trap | To send an SNMP Trap. |
A Default Message Content can be configured per class of parameters and is used when no specific alert message is defined at the Monitor level. Default messages are provided out of the box for a selection of classes (Host, Numeric Value Extraction, String Search, etc.). They contain basic information about the origin of the detected problem. These messages are used for all the parameters of an application class when you choose to use the Default Message property in Studio > Template/Monitor > Parameters and Alerts.
You can customize the default message content with Alert Messages Macros to keep your message dynamic and contextual. You can also delete all the provided class-specific default messages. In that case, the customizable message defined in the Other Classes option will automatically be used for all classes of parameters that are set to send a message upon a threshold violation.
Finally, you can create and customize your own default class messages to fully meet your notification policies or requirements.
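
The message selection described above (Monitor-level message first, then the class default, then Other Classes) can be sketched as a simple fallback lookup. The function name, the dictionary layout, and the message texts below are hypothetical and only serve the illustration; the macros are taken from the tables in this section.

```python
from typing import Dict, Optional

OTHER_CLASSES = "Other Classes"

# Hypothetical default messages, keyed by class of parameters
default_messages: Dict[str, str] = {
    "Host": "Host %{HOST_NAME}: %{PARAMETER_NAME} is %{PARAMETER_STATUS}",
    OTHER_CLASSES: "%{OBJECT_LABEL}: %{PARAMETER_NAME} = %{PARAMETER_VALUE}",
}


def select_message(object_class: str, monitor_message: Optional[str] = None) -> str:
    """Pick the message template to send for an alert."""
    if monitor_message:
        # A message defined at the Monitor level takes priority
        return monitor_message
    # Otherwise use the class default, falling back to Other Classes
    return default_messages.get(object_class, default_messages[OTHER_CLASSES])
```

Deleting a class entry from `default_messages` in this sketch has the same effect as deleting a class-specific default message: alerts for that class fall back to the Other Classes text.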
Alert Messages Macros
You can use macros to customize the content of alert messages. For example: %{PARAMETER_VALUE} is replaced by the actual current value of the parameter that triggered the alert. Each macro listed in the tables below contains information about the problem that triggered the alert.
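
To make the substitution concrete, here is a minimal Python sketch of how `%{NAME}` placeholders could be expanded. It is not the product's implementation: `expand_macros` is a hypothetical helper, the sample values are invented, and leaving unknown macros untouched is an assumption of this sketch.

```python
import re

# Matches %{NAME} and captures NAME
MACRO = re.compile(r"%\{([^}]+)\}")


def expand_macros(template: str, values: dict) -> str:
    """Replace each %{NAME} with its value; unknown macros are left untouched."""
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, match.group(0)))
    return MACRO.sub(substitute, template)


message = expand_macros(
    "%{PARAMETER_NAME} on %{HOST_NAME} is %{PARAMETER_VALUE}%{PARAMETER_UNITS} (%{PARAMETER_STATUS})",
    {
        "PARAMETER_NAME": "ProcessorUtilization",
        "HOST_NAME": "server01",
        "PARAMETER_VALUE": 93,
        "PARAMETER_UNITS": "%",
        "PARAMETER_STATUS": "ALARM",
    },
)
```

Here `message` becomes `ProcessorUtilization on server01 is 93% (ALARM)`.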
General Macros
The macros listed in the table below can be used on any object.
| General Macros | Description |
|---|---|
| %{CREDENTIALS} | Label of the credentials used by the monitor. |
| %{COMMA} | Inserts a comma. |
| %{CONVERTED_UNIT} | Converted unit, if convert units is selected. |
| %{CONVERTED_VALUE_UNIT} | Converted value (rounded to 2 decimal places) followed by the converted unit, if any. This converted value is provided before any rescaling, but after calculating sum, min, max, etc. |
| %{CONVERTED_VALUE} | Converted (if convert units is selected) or extracted value, before any rescaling, but after calculating sum, min, max, etc. |
| %{DATE} | Current date in the YYYY-MM-DD format. |
| %{EOL} | Inserts a carriage return. |
| %{EXTRACTED_LINE} | Content of the extracted line (X_DYNAMIC and X_NAGIOSPERF classes only). |
| %{HOST_DOMAIN} | Domain of the targeted host. |
| %{HOST_FQDN} | Fully qualified domain name of the targeted host. |
| %{HOST_IPADDRESS} | IP address of the targeted host. |
| %{HOST_NAME} or %{HOSTNAME} | Name of the targeted host. |
| %{HOST_SNMPCOMMUNITY} | SNMP community set for the SNMP Agent on the targeted host. |
| %{HOST_SYSTEMTYPE} | Operating system type of the targeted host. |
| %{INFORMATION} | Provides additional information about the problem/alert. |
| %{INFORMATION_ONELINE} | Provides additional information about the problem/alert in a single line. |
| %{OBJECT_CLASS} | Class name of the object to which the alert action belongs. |
| %{OBJECT_ID} | PATROL ID of the object triggering the alert. |
| %{OBJECT_LABEL} | Display name of the object triggering the alert. |
| %{OBJECT_TYPE} | Type of the object triggering the alert (“Process”, “String”, etc.). |
| %{PARAMETER_ALARM1MAX} | Alarm1 maximum range of the parameter triggering the alert. |
| %{PARAMETER_ALARM1MIN} | Alarm1 minimum range of the parameter triggering the alert. |
| %{PARAMETER_ALARM1NTIMES} | Number of consecutive times the parameter triggering the alert must have a value within the alarm1 range before the alert occurs. |
| %{PARAMETER_ALARM1TYPE} | Alarm1 alert type of the parameter triggering the alert (OK, WARN, ALARM). |
| %{PARAMETER_ALARM2MAX} | Alarm2 maximum range of the parameter triggering the alert. |
| %{PARAMETER_ALARM2MIN} | Alarm2 minimum range of the parameter triggering the alert. |
| %{PARAMETER_ALARM2NTIMES} | Number of consecutive times the parameter triggering the alert must have a value within the alarm2 range before the alert occurs. |
| %{PARAMETER_ALARM2TYPE} | Alarm2 alert type of the parameter triggering the alert (OK, WARN, ALARM). |
| %{PARAMETER_BORDERMAX} | Border maximum range of the parameter triggering the alert. |
| %{PARAMETER_BORDERMIN} | Border minimum range of the parameter triggering the alert. |
| %{PARAMETER_BORDERNTIMES} | Number of consecutive times the parameter triggering the alert must have a value outside the border range before the alert occurs. |
| %{PARAMETER_BORDERTYPE} | Border alert type of the parameter triggering the alert (OK, WARN, ALARM). |
| %{PARAMETER_NAME} | Name of the parameter triggering the alert. |
| %{PARAMETER_STATUS} | Status of the parameter. |
| %{PARAMETER_TITLE} | Title of the parameter. |
| %{PARAMETER_UNITS} | Units of the parameter. |
| %{PARAMETER_VALUE} | Value of the parameter triggering the alert. |
| %{PARENT_CLASS} | Class name of the parent object to which the alert action belongs. |
| %{PARENT_ID} | PATROL identifier of the object’s parent. |
| %{PARENT_LABEL} | Display name of the object’s parent. |
| %{PARENT_TYPE} | Type of the object’s parent (“File”, “CommandLine”, etc.). |
| %{PASSWORD} | Encrypted password of the targeted host. |
| %{RESULT} | Query result received for the monitored object during data collection, when available. |
| %{SEMICOLON} | Inserts a semicolon. |
| %{STATUS_INFORMATION} | Provides additional information about the Status, as reported by the StatusInformation parameter (where available). |
| %{TIME} | Time in HH:MM:SS format. |
| %{USERNAME} | Username defined in the monitor’s credentials. |
| %{VALUE} | The current value of the parameter triggering the alert. |
| %{/<variable path>} | Value of the PATROL Agent namespace variable relative to the parameter that triggered the alert (for example: %{/hostname}, %{../sid}, %{/osName}, %{../osCommand}). |
Monitor-Specific Macros
The macros listed below can be used in alert messages and are specific to their respective object type.
| Object-Specific Macros | Description |
|---|---|
| Command Line | |
| %{COMMAND_LINE} | Command line being executed and analyzed. |
| %{EXIT_STATUS_CODE} | Exit status returned by the system after executing the command. |
| Database Monitor | |
| %{DATABASE_NAME} | Name of the database the SQL query is sent to. May be the database name for SQL Server, or the Oracle SID for Oracle. |
| %{DATABASE_TYPE} | Type of the database. |
| %{QUERY} | SQL statement sent for execution. |
| Dynamic Object | |
| %{RESULT} | Returns the output of the dynamic object. |
| Dynamic Value Map | |
| %{RETAINED_VALUE} | Value retained by the collect and mapped to a status. |
| %{MAPPED_STATUS_INFORMATION} | Provides additional information about the Status of the value mapping result, as reported by the StatusInformation parameter. |
| File | |
| %{FILENAME} | Name of the monitored file as entered in the GUI. |
| %{MONITORED_FILE} | Current file being monitored. |
| File System | |
| %{FILESYSTEM} | Name of the monitored file system. |
| Folder | |
| %{FOLDER} | Folder being monitored. |
| %{OLDEST_FILES_IN_FOLDER} | Name of the oldest file in the folder. |
| Host | |
| %{AVAILABILITY_CHECKS} | List of configured availability checks, separated by commas. |
| %{CREDENTIALS_LIST} | List of credentials, separated by commas. |
| %{SIGNATURE_FILES} | List of signature files, separated by commas. |
| %{TCP_PORT} | Port number used for the TCP availability check. |
| Multi-Parameter Formula | |
| %{FORMULA} | User-defined formula used to calculate the parameter value. |
| Process | |
| %{COMMAND_LINE} | Process command line being searched for, as entered in the GUI. |
| %{MATCHING_PROCESSES} | List of all matching processes. |
| %{PID_FILE} | Path to the PID file whose corresponding process is being monitored. |
| %{PROCESS_NAME} | Process name being searched for, as entered in the GUI. |
| %{USER_ID} | Process user ID being searched for, as entered in the GUI. |
| %{WORST_PROCESS_COMMANDLINE} | Command line of the first worst process. |
| %{WORST_PROCESS_NAME} | Name of the first worst process. |
| %{WORST_PROCESS_PID} | PID of the first worst process. |
| %{WORST_PROCESS_PPID} | PPID of the first worst process. |
| %{WORST_PROCESS_STATE} | State of the first worst process. |
| %{WORST_PROCESS_USERNAME} | Username of the first worst process. |
| %{WORST_PROCESSES} | List of worst processes, semicolon delimited, containing PID, process name, username, PPID, state and command line. |
| SNMP Polling | |
| %{CONTENT} | Value of the OID being polled. |
| %{OID} | SNMP OID being polled. |
| SNMP Trap Macros | |
| %{CONTENT} | Content of the received trap. |
| %{ENTERPRISE_ID} | Enterprise ID (OID) of the SNMP traps being looked for. |
| %{FOUND_IP} | Actual originating IP address of the trap that has been received. |
| %{FOUND_TRAP_NUMBER} | Actual SNMP trap number that has been received and matches the entered criteria. |
| %{TRAP_NUMBER} | SNMP Trap numbers (specific numbers) being looked for. |
| String Search Macros | |
| %{LAST_MATCHING_LINE} | Last line that matches the string search criteria. |
| %{LAST_MATCHING_LINES} | Last 10 lines that match the string search criteria. |
| %{STRING1} | First regular expression being searched for. |
| %{STRING2} | Second regular expression being searched for. |
| Template | |
| %{TEMPLATE_NAME} | Name of the Template related to the parameter triggering the alert. |
| %{TEMPLATE_CLASS} | Template’s application class name. |
| %{TEMPLATE_COLLECTIONERRORS} | List of collection errors that occurred between the current collect and the previous one. |
| %{TEMPLATE_CONTACT} | Contact information in case of a Template failure. |
| %{TEMPLATE_DESCRIPTION} | Description of the Template as provided in the Web interface. |
| %{TEMPLATE_ID} | PATROL ID of the Template triggering the alert. |
| %{TEMPLATE_LABEL} | Display name of the Template triggering the alert. |
| %{TEMPLATE_TYPE} | Type of the Template triggering the alert (Template). |
| Value Map Macros | |
| %{RETAINED_VALUE} | Value retained by the collect and mapped to a status. |
| %{MAPPED_STATUS_INFORMATION} | Provides additional information about the status of the value mapping result, as reported by the Status Information parameter. |
| WBEM Query Macros | |
| %{NAMESPACE} | Namespace of the WBEM query. |
| %{QUERY} | WBEM statement sent for execution. |
| Web Request Macros | |
| %{HTTP_METHOD} | GET or POST depending on what was selected in the GUI. |
| %{URL} | URL being tested. |
| WMI Macros | |
| %{NAMESPACE} | Namespace of the WMI query. |
| %{QUERY} | WMI statement sent for execution. |
| Windows Event Macros | |
| %{CONTENT} | Message content of the last matching event. |
| %{EVENT_ID} | ID of the Windows events being searched for. |
| %{EVENT_LOG} | Name of the Windows event log being monitored. |
| %{MATCHING_EVENTS} | List of matching events. |
| %{PROVIDER} | Windows Event source whose new entries are monitored. |
| %{RECORD_NUMBER} | Record number of the last matching event. |
| Windows Performance Macros | |
| %{PERFORMANCE_COUNTER} | Windows performance counter being monitored. |
| %{PERFORMANCE_INSTANCE} | Windows performance object instances being monitored. |
| %{PERFORMANCE_OBJECT} | Windows performance object name being monitored. |
| Windows Service Macro | |
| %{SERVICE_NAME} | Name of the monitored Windows service. |