Configure the Hardware Sentry Agent
The Hardware Sentry Agent collects the hardware health of the monitored systems and pushes the collected data to the OTLP receiver. Hardware Sentry OpenTelemetry Collector then processes the hardware observability and sustainability metrics and exposes them in the backend platform of your choice (Datadog, BMC Helix, Prometheus, Grafana, etc.).
To ensure this process runs smoothly, you need to configure a few settings in the config/hws-config.yaml file to allow Hardware Sentry OpenTelemetry Collector to:
- identify which site is monitored with this agent
- calculate the electricity costs and the carbon footprint of this site
- monitor the systems in this site.
Note that all changes made to the config/hws-config.yaml file are taken into account immediately. There is therefore no need to restart the OpenTelemetry Collector.
Configure a site
A site represents the data center or the server room in which all the systems to be monitored are located. Configure your site in the extraLabels section of the config/hws-config.yaml file as shown in the example below:
extraLabels:
  site: boston
Configure the sustainability settings
To obtain the electricity costs and carbon footprint of your site, configure the extraMetrics section of the config/hws-config.yaml file as follows:
extraMetrics:
  hw.site.carbon_density_grams: 350 # in g/kWh
  hw.site.electricity_cost_dollars: 0.12 # in $/kWh
  hw.site.pue_ratio: 1.8
where:
- hw.site.carbon_density_grams is the carbon density in grams per kilowatt-hour (g/kWh). This information is required to calculate the carbon emissions of your site. The carbon density corresponds to the amount of CO₂ emissions produced per kWh of electricity and varies depending on the country and the region where the data center is located. See the electricityMap Web site for reference.
- hw.site.electricity_cost_dollars is the electricity price in dollars per kilowatt-hour ($/kWh). This information is required to calculate the energy cost of your site. Refer to your energy contract for the tariff per kilowatt-hour charged by your supplier, or refer to the GlobalPetrolPrices Web site.
- hw.site.pue_ratio is the Power Usage Effectiveness (PUE) of your site. By default, sites are set with a PUE of 1.8, which is the average value for typical data centers.
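To see how the three settings interact, here is a minimal Python sketch. It assumes the usual PUE definition (facility energy = IT energy × PUE); the exact formulas used by Hardware Sentry are not stated on this page, and the names `SiteSettings` and `estimate_site_footprint` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSettings:
    """Mirrors the three extraMetrics values from the example above."""
    carbon_density_grams: float = 350.0     # g of CO2 per kWh
    electricity_cost_dollars: float = 0.12  # $ per kWh
    pue_ratio: float = 1.8                  # facility energy / IT energy


def estimate_site_footprint(it_power_watts: float, hours: float,
                            site: SiteSettings) -> dict:
    """Estimate facility energy, cost and CO2 for a given IT power draw.

    Assumption: facility energy = IT energy x PUE.
    """
    it_energy_kwh = it_power_watts / 1000.0 * hours
    facility_energy_kwh = it_energy_kwh * site.pue_ratio
    return {
        "energy_kwh": facility_energy_kwh,
        "cost_dollars": facility_energy_kwh * site.electricity_cost_dollars,
        "carbon_grams": facility_energy_kwh * site.carbon_density_grams,
    }


if __name__ == "__main__":
    # Hypothetical load: 1 kW of IT equipment running for 24 hours.
    print(estimate_site_footprint(1000, 24, SiteSettings()))
```

With these assumptions, a lower PUE or a cleaner electricity mix reduces the estimated cost and emissions proportionally.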
Configure the monitored hosts
To collect metrics from your hosts, you must provide the following information in the config/hws-config.yaml file:
- the hostname of the host to be monitored
- its type
- the protocol to be used.
Important: Because a typo or incorrect indentation in the hws-config.yaml file could cause your hardware monitoring to fail, it is highly recommended to install the vscode-yaml extension in your editor to benefit from tooltips and autocompletion suggested by the Hardware Sentry Configuration JSON Schema.
Monitored hosts
Systems to monitor are defined under hosts with the below syntax:
hosts:
- host:
    hostname: <hostname>
    type: <host-type>
  <protocol-configuration>
where:
- <hostname> is the name of the host, or its IP address
- <host-type> is the type of the host to be monitored. Possible values are:
  - win for Microsoft Windows systems
  - linux for Linux systems
  - network for network devices
  - oob for out-of-band management cards
  - storage for storage systems
  - aix for IBM AIX systems
  - hpux for HP-UX systems
  - solaris for Oracle Solaris systems
  - tru64 for HP Tru64 systems
  - vms for HP OpenVMS systems
  Refer to Monitored Systems for more details.
- <protocol-configuration> is the protocol(s) Hardware Sentry OpenTelemetry Collector will use to communicate with the hosts: http, ipmi, oscommand, ssh, snmp, wmi, wbem or winrm. Refer to Protocols and credentials for more details.
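Because a wrong host type or a missing protocol can silently break monitoring, a quick pre-check of a parsed host entry may help. The sketch below is not part of Hardware Sentry: `validate_host_entry` is a hypothetical helper that works on an already-parsed entry (a plain Python dict) and only checks the values listed above. Protocol keys are compared case-insensitively and accepted either next to or inside `host`, which is an assumption.

```python
HOST_TYPES = {"win", "linux", "network", "oob", "storage",
              "aix", "hpux", "solaris", "tru64", "vms"}
PROTOCOLS = {"http", "ipmi", "oscommand", "ssh",
             "snmp", "wmi", "wbem", "winrm"}


def validate_host_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one parsed `hosts` entry."""
    problems = []
    host = entry.get("host") or {}
    if not host.get("hostname"):
        problems.append("missing hostname")
    if str(host.get("type", "")).lower() not in HOST_TYPES:
        problems.append(f"unknown host type: {host.get('type')!r}")
    # Assumption: a protocol block may sit beside or inside `host`.
    keys = set(entry) | set(host)
    if not any(str(key).lower() in PROTOCOLS for key in keys):
        problems.append("no protocol configured")
    return problems


if __name__ == "__main__":
    entry = {"host": {"hostname": "myhost-01", "type": "linux"},
             "snmp": {"version": "v1", "community": "public"}}
    print(validate_host_entry(entry))
```

An empty list means the entry passed these basic checks; it does not guarantee that the agent will accept the configuration.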
Protocols and credentials
HTTP
Use the parameters below to configure the HTTP protocol:
Parameter | Description |
---|---|
http | Protocol used to access the host. |
port | The HTTPS port number used to perform HTTP requests (Default: 443). |
username | Name used to establish the connection with the host via the HTTP protocol. |
password | Password used to establish the connection with the host via the HTTP protocol. |
Example
hosts:
- host:
    hostname: myhost-01
    type: storage
  http:
    https: true
    port: 443
    username: myusername
    password: mypwd
IPMI
Use the parameters below to configure the IPMI protocol:
Parameter | Description |
---|---|
ipmi | Protocol used to access the host. |
username | Name used to establish the connection with the host via the IPMI protocol. |
password | Password used to establish the connection with the host via the IPMI protocol. |
Example
hosts:
- host:
    hostname: myhost-01
    type: oob
  ipmi:
    username: myusername
    password: mypwd
OS commands
Use the parameters below to configure OS Commands that are executed locally:
Parameter | Description |
---|---|
osCommand | Protocol used to access the host. |
timeout | How long until the local OS Commands time out (Default: 120s). |
useSudo | Whether sudo is used or not for the local OS Command: true or false (Default: false). |
useSudoCommands | List of commands for which sudo is required. |
sudoCommand | Sudo command to be used (Default: sudo). |
Example
hosts:
- host:
    hostname: myhost-01
    type: linux
  osCommand:
    timeout: 120
    useSudo: true
    useSudoCommands: [ cmd1, cmd2 ]
    sudoCommand: sudo
SSH
Use the parameters below to configure the SSH protocol:
Parameter | Description |
---|---|
ssh | Protocol used to access the host. |
timeout | How long until the command times out (Default: 120s). |
useSudo | Whether sudo is used or not for the SSH Command (true or false). |
useSudoCommands | List of commands for which sudo is required. |
sudoCommand | Sudo command to be used (Default: sudo). |
username | Name to use for performing the SSH query. |
password | Password to use for performing the SSH query. |
privateKey | Private Key File to use to establish the connection to the host through the SSH protocol. |
Example
hosts:
- host:
    hostname: myhost-01
    type: linux
  ssh:
    timeout: 120
    useSudo: true
    useSudoCommands: [ cmd1, cmd2 ]
    sudoCommand: sudo
    username: myusername
    password: mypwd
    privateKey: /tmp/ssh-key.txt
SNMP
Use the parameters below to configure the SNMP protocol:
Parameter | Description |
---|---|
snmp | Protocol used to access the host. |
version | The version of the SNMP protocol (v1, v2c, v3-no-auth, v3-md5, v3-sha). |
community | The SNMP Community string to use to perform SNMP v1 queries (Default: public). |
port | The SNMP port number used to perform SNMP queries (Default: 161). |
timeout | How long until the SNMP request times out (Default: 120s). |
privacy | SNMP v3 only - The type of encryption protocol (none, aes, des). |
privacyPassword | SNMP v3 only - Password associated with the privacy protocol. |
username | SNMP v3 only - Name to use for performing the SNMP query. |
password | SNMP v3 only - Password to use for performing the SNMP query. |
Example
hosts:
- host:
    hostname: myhost-01
    type: linux
  snmp:
    version: v1
    community: public
    port: 161
    timeout: 120s
- host:
    hostname: myhost-01
    type: linux
  snmp:
    version: v2c
    community: public
    port: 161
    timeout: 120s
- host:
    hostname: myhost-01
    type: linux
  snmp:
    version: v3-md5
    community: public
    port: 161
    timeout: 120s
    privacy: des
    privacyPassword: myprivacypwd
    username: myusername
    password: mypwd
WBEM
Use the parameters below to configure the WBEM protocol:
Parameter | Description |
---|---|
wbem | Protocol used to access the host. |
protocol | The protocol used to access the host. |
port | The port number used to perform WBEM queries (Default: 5989 for HTTPS or 5988 for HTTP). |
timeout | How long until the WBEM request times out (Default: 120s). |
username | Name used to establish the connection with the host via the WBEM protocol. |
password | Password used to establish the connection with the host via the WBEM protocol. |
Example
hosts:
- host:
    hostname: myhost-01
    type: storage
  wbem:
    protocol: https
    port: 5989
    timeout: 120s
    username: myusername
    password: mypwd
WMI
Use the parameters below to configure the WMI protocol:
Parameter | Description |
---|---|
wmi | Protocol used to access the host. |
timeout | How long until the WMI request times out (Default: 120s). |
username | Name used to establish the connection with the host via the WMI protocol. |
password | Password used to establish the connection with the host via the WMI protocol. |
Example
hosts:
- host:
    hostname: myhost-01
    type: win
  wmi:
    timeout: 120s
    username: myusername
    password: mypwd
WinRM
Use the parameters below to configure the WinRM protocol:
Parameter | Description |
---|---|
winrm | Protocol used to access the host. |
timeout | How long until the WinRM request times out (Default: 120s). |
username | Name used to establish the connection with the host via the WinRM protocol. |
password | Password used to establish the connection with the host via the WinRM protocol. |
protocol | The protocol used to access the host: HTTP or HTTPS (Default: HTTP). |
port | The port number used to perform WQL queries and commands (Default: 5985 for HTTP or 5986 for HTTPS). |
authentications | Ordered list of authentication schemes: NTLM, KERBEROS (Default: NTLM). |
Example
hosts:
- host:
    hostname: server-11
    type: win
  winrm:
    protocol: http
    port: 5985
    username: myusername
    password: mypwd
    timeout: 120s
    authentications: [ntlm]
Additional settings (Optional)
Alert Settings
Disabling Alerts (Not Recommended)
To disable Hardware Sentry OpenTelemetry Collector's alerts:
- for all your hosts, set the disableAlerts parameter to true just before the hosts section:
  disableAlerts: true
  hosts: # ...
- for a specific host, set the disableAlerts parameter to true in the relevant host section:
  hosts:
  - host:
      hostname: myhost
      type: linux
    snmp:
      version: v1
      community: public
      port: 161
      timeout: 120s
    disableAlerts: true
Hardware Problem template
When detecting a hardware problem, Hardware Sentry OpenTelemetry Collector triggers an alert as OpenTelemetry log. The alert body is built from the following template:
Hardware problem on ${FQDN} with ${MONITOR_NAME}.${NEWLINE}${NEWLINE}${ALERT_DETAILS}${NEWLINE}${NEWLINE}${FULLREPORT}
To change this default hardware problem template:
- for all your hosts, configure the hardwareProblemTemplate parameter just before the hosts section and indicate the template to use when building alert messages:
  hardwareProblemTemplate: Custom hardware problem on ${FQDN} with ${MONITOR_NAME}.
  hosts: # ...
- for a specific host, configure the hardwareProblemTemplate parameter in the relevant host section and indicate the template to use for that host:
  hosts:
  - host:
      hostname: myhost
      type: linux
    snmp:
      version: v1
      community: public
      port: 161
      timeout: 120s
    hardwareProblemTemplate: Custom hardware problem on myhost with ${MONITOR_NAME}.
For more information about the alert mechanism and the macros to use, refer to the Alerts page.
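To get a feel for how the macros in a template expand, the following Python sketch renders the default template with `string.Template`, which happens to use the same `${NAME}` syntax. This only illustrates the substitution idea: the agent's actual expansion logic is not described here, the macro values are made up, and treating `${NEWLINE}` as a line feed is an assumption.

```python
from string import Template

DEFAULT_TEMPLATE = (
    "Hardware problem on ${FQDN} with ${MONITOR_NAME}."
    "${NEWLINE}${NEWLINE}${ALERT_DETAILS}${NEWLINE}${NEWLINE}${FULLREPORT}"
)


def render_alert_body(template: str, macros: dict) -> str:
    """Replace ${MACRO} placeholders; unknown macros are left untouched."""
    values = {"NEWLINE": "\n", **macros}  # assumption: NEWLINE is a line feed
    return Template(template).safe_substitute(values)


if __name__ == "__main__":
    # Hypothetical macro values, for illustration only.
    print(render_alert_body(DEFAULT_TEMPLATE, {
        "FQDN": "myhost.example.com",
        "MONITOR_NAME": "Fan 1",
        "ALERT_DETAILS": "Speed too low",
        "FULLREPORT": "Full report text",
    }))
```

Previewing a custom template this way can reveal a misspelled macro name, since it stays in the output unexpanded.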
Authentication Settings
Basic authentication header
The Hardware Sentry OpenTelemetry Collector's internal OTLP Exporter authenticates itself with the OTLP gRPC Receiver by including the HTTP Authorization request header with the credentials. A predefined Basic Authentication Header value is stored internally and included in each request when sending telemetry data.
To override the default value of the Basic Authentication Header, add a new Authorization header under the exporter:otlp:headers section:
exporter:
  otlp:
    headers:
      Authorization: Basic <credentials>
hosts: # ...
where <credentials> are built by first joining your username and password with a colon (myUsername:myPassword) and then encoding the value in base64.
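The two steps (join with a colon, then Base64-encode) can be sketched in Python with the standard library. The function name `build_basic_auth_header` is hypothetical, and the credentials are the placeholder values used above.

```python
import base64


def build_basic_auth_header(username: str, password: str) -> str:
    """Build the value of a Basic Authorization header."""
    # Step 1: join username and password with a colon.
    joined = f"{username}:{password}"
    # Step 2: Base64-encode the UTF-8 bytes of the joined string.
    encoded = base64.b64encode(joined.encode("utf-8")).decode("ascii")
    return f"Basic {encoded}"


if __name__ == "__main__":
    print(build_basic_auth_header("myUsername", "myPassword"))
```

Keep in mind that Base64 is an encoding, not encryption: anyone who can read the header can recover the password.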
For more security, encrypt the Basic <credentials> value. See Encrypting Passwords for more details.
Warning: If you update the Basic Authentication Header, you must generate a new .htpasswd file for the OpenTelemetry Collector Basic Authenticator.
Monitoring Settings
Collect period
By default, Hardware Sentry OpenTelemetry Collector collects metrics from the monitored hosts every minute. To change the default collect period:
- for all your hosts, add the collectPeriod parameter just before the hosts section:
  collectPeriod: 2m
  hosts: # ...
- for a specific host, add the collectPeriod parameter in the relevant host section:
  hosts:
  - host:
      hostname: myhost
      type: linux
    snmp:
      version: v1
      community: public
      port: 161
      timeout: 120s
    collectPeriod: 1m30s # Customized
Warning: Collecting metrics too frequently can cause CPU-intensive workloads.
Connectors
The Hardware Sentry OpenTelemetry Collector comes with the Hardware Connector Library, a library that consists of hundreds of hardware connectors that describe how to discover hardware components and detect failures. When running Hardware Sentry OpenTelemetry Collector, the connectors are automatically selected based on the device type provided and the enabled protocols. You can however indicate to Hardware Sentry OpenTelemetry Collector which connectors should be used or excluded.
Use the parameters below to select or exclude connectors:
Parameter | Description |
---|---|
selectedConnectors | Connector(s) to use to monitor the host. No automatic detection will be performed. |
excludedConnectors | Connector(s) that must be excluded from the automatic detection. |
Connector names must be comma-separated, as shown in the example below:
hosts:
- host:
    hostname: myhost-01
    type: win
  wmi:
    timeout: 120s
    username: myusername
    password: mypwd
  selectedConnectors: [ VMwareESX4i, VMwareESXi ]
  excludedConnectors: [ VMwareESXiDisksStorage ]
Note: Any misspelled connector name will be ignored.
To know which connectors are available, refer to Monitored Systems or run the below command:
$ hws -l
For more information about the hws command, refer to Hardware Sentry CLI (hws).
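The interplay between selectedConnectors and excludedConnectors described in the table above can be pictured with a small Python sketch. It is only a model of the documented behavior, not the agent's code: `resolve_connectors` is a hypothetical function, and ignoring excludedConnectors when selectedConnectors is set is an assumption.

```python
def resolve_connectors(detected, selected=None, excluded=None):
    """Return the connector names that would be used for a host.

    detected: names found by automatic detection
    selected: selectedConnectors (disables automatic detection)
    excluded: excludedConnectors (removed from automatic detection)
    """
    if selected:
        # Explicit selection: automatic detection is not performed.
        return list(selected)
    excluded_set = set(excluded or [])
    return [name for name in detected if name not in excluded_set]


if __name__ == "__main__":
    detected = ["VMwareESXi", "VMwareESXiDisksStorage"]
    print(resolve_connectors(detected, excluded=["VMwareESXiDisksStorage"]))
    print(resolve_connectors(detected, selected=["VMwareESX4i", "VMwareESXi"]))
```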
Discovery cycle
Hardware Sentry OpenTelemetry Collector periodically performs discoveries to detect new components in your monitored environment. By default, Hardware Sentry OpenTelemetry Collector runs a discovery after 30 collects. To change this default discovery cycle:
- for all your hosts, add the discoveryCycle parameter just before the hosts section and indicate the number of collects after which a discovery will be performed:
  discoveryCycle: 15
  hosts: # ...
- for a specific host, add the discoveryCycle parameter in the relevant host section:
  hosts:
  - host:
      hostname: myhost
      type: linux
    snmp:
      version: v1
      community: public
      port: 161
      timeout: 120s
    discoveryCycle: 5 # Customized
Warning: Running discoveries too frequently can cause CPU-intensive workloads.
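Since the discovery cycle is expressed as a number of collects, the time between two discoveries depends on the collect period as well. A rough estimate, assuming the interval is simply the collect period multiplied by the discovery cycle (the helper name `discovery_interval` is hypothetical):

```python
from datetime import timedelta


def discovery_interval(collect_period: timedelta,
                       discovery_cycle: int) -> timedelta:
    """Approximate time between two discoveries."""
    if discovery_cycle < 1:
        raise ValueError("discovery_cycle must be at least 1")
    return collect_period * discovery_cycle


if __name__ == "__main__":
    # Defaults from this page: one collect per minute, discovery every 30 collects.
    print(discovery_interval(timedelta(minutes=1), 30))
    # Customized host: collectPeriod 1m30s and discoveryCycle 5.
    print(discovery_interval(timedelta(seconds=90), 5))
```

Under this assumption, the defaults give one discovery roughly every 30 minutes.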
Extra labels
Add labels in the extraLabels section to override the data collected by the Hardware Sentry Agent or add additional attributes to the Host Resource. These attributes are added to each metric of that Resource when exported to time series platforms like Prometheus.
In the example below, we override the host.name attribute resolved by Hardware Sentry OpenTelemetry Collector with host01.internal.domain.net and indicate that it is the Jenkins app:
hosts:
- host:
    hostname: host01
    type: linux
  snmp:
    version: v1
    port: 161
    timeout: 120
  extraLabels:
    host.name: host01.internal.domain.net
    app: Jenkins
Hostname resolution
By default, Hardware Sentry OpenTelemetry Collector resolves the hostname of the host to a Fully Qualified Domain Name (FQDN) and displays this value in the Host Resource attribute host.name. To display the configured hostname instead, set resolveHostnameToFqdn to false:
resolveHostnameToFqdn: false
hosts:
- host:
    hostname: host01
    type: linux
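The effect of this switch can be modeled in a few lines of Python. This is only an analogy built on the standard `socket.getfqdn` call; how the agent actually performs the resolution is not described here, and `displayed_host_name` is a hypothetical name.

```python
import socket


def displayed_host_name(hostname: str, resolve_to_fqdn: bool = True) -> str:
    """Return the value that would end up in the host.name attribute."""
    if resolve_to_fqdn:
        # Ask the local resolver for the fully qualified name.
        return socket.getfqdn(hostname)
    # Resolution disabled: keep the configured hostname as-is.
    return hostname
```

The result of the resolved case depends on the DNS and hosts configuration of the machine running the code.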
Job pool size
By default, Hardware Sentry OpenTelemetry Collector runs up to 20 discovery and collect jobs in parallel. To increase or decrease the number of jobs Hardware Sentry OpenTelemetry Collector can run simultaneously, add the jobPoolSize parameter just before the hosts section and indicate the number of jobs:
jobPoolSize: 20
hosts: # ...
Warning: Running too many jobs in parallel can lead to an OutOfMemory error.
Sequential mode
By default, Hardware Sentry OpenTelemetry Collector sends the queries to the host in parallel. Although the parallel mode is faster than the sequential one, too many requests at the same time can lead to the failure of the targeted system.
To force all the network calls to be executed in sequential order:
- for all your hosts, enable the sequential option just before the hosts section (NOT RECOMMENDED):
  sequential: true
  hosts: # ...
- for a specific host, enable the sequential option in the relevant host section:
  hosts:
  - host:
      hostname: myhost
      type: linux
    snmp:
      version: v1
      community: public
      port: 161
      timeout: 120s
    sequential: true # Customized
Warning: Sending requests in sequential mode slows down the monitoring significantly. Instead of using the sequential mode, you could increase the maximum number of allowed concurrent requests in the monitored system, if the manufacturer allows it.
Timeout, Duration and Period Format
Timeouts, durations and periods are specified with the below format:
Unit | Description | Examples |
---|---|---|
s | seconds | 120s |
m | minutes | 90m, 1m15s |
h | hours | 1h, 1h30m |
d | days (based on a 24-hour day) | 1d |
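For scripts that need to reason about these values, the format in the table can be parsed with a short Python function. This is a sketch of the documented notation only: `parse_duration` is a hypothetical helper, and values without a unit are rejected because the table does not describe them.

```python
import re
from datetime import timedelta

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}
_TOKEN = re.compile(r"(\d+)([smhd])")
_FULL = re.compile(r"(?:\d+[smhd])+")


def parse_duration(text: str) -> timedelta:
    """Parse values such as 120s, 1m15s, 1h30m or 1d."""
    value = text.strip()
    if not _FULL.fullmatch(value):
        raise ValueError(f"unsupported duration: {text!r}")
    # Sum every <number><unit> pair, converted to seconds.
    seconds = sum(int(number) * _UNIT_SECONDS[unit]
                  for number, unit in _TOKEN.findall(value))
    return timedelta(seconds=seconds)


if __name__ == "__main__":
    for sample in ("120s", "90m", "1m15s", "1h30m", "1d"):
        print(sample, "=", parse_duration(sample))
```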
Security settings
Trusted certificates file
A TLS handshake takes place when the Hardware Sentry Agent's OTLP Exporter instantiates a communication with the OTLP gRPC Receiver. By default, the internal OTLP Exporter client is configured to trust the OTLP gRPC Receiver's certificate security/otel.crt.
If you generate a new server's certificate for the OTLP gRPC Receiver, you must configure the trustedCertificatesFile parameter under the exporter:otlp section:
exporter:
  otlp:
    trustedCertificatesFile: security/new-server-cert.crt
hosts: # ...
The file should be stored in the security folder and should contain one or more X.509 certificates in PEM format.
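Before pointing trustedCertificatesFile at a new file, you may want to confirm that it really contains PEM certificate blocks. The Python sketch below performs a purely structural check (it counts BEGIN/END CERTIFICATE blocks and does not validate the certificates themselves); `count_pem_certificates` is a hypothetical helper, not a Hardware Sentry tool.

```python
import re

# One PEM block: BEGIN marker, Base64 payload, END marker.
_PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----\s+"
    r"[A-Za-z0-9+/=\s]+?"
    r"-----END CERTIFICATE-----"
)


def count_pem_certificates(text: str) -> int:
    """Count the PEM-encoded certificate blocks found in the given text."""
    return len(_PEM_CERT.findall(text))


if __name__ == "__main__":
    # Path taken from the example above; adjust it to your installation.
    with open("security/new-server-cert.crt", encoding="ascii") as handle:
        print(count_pem_certificates(handle.read()), "certificate(s) found")
```

A count of zero suggests the file is in another format (for example DER) and would need to be converted first.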