
  • Learning NAGIOS 3.0
  • Wojciech Kocjan
  • 415 words
  • 2021-08-25 18:05:38

Soft and Hard States

Nagios works by checking whether a particular host or service is working correctly and storing its status. Because the status of a service can only be one of four possible values (OK, WARNING, CRITICAL, or UNKNOWN), it is crucial that the stored value accurately reflects the real situation. To avoid reporting random and temporary problems, Nagios uses soft and hard states to describe the current status of a host or service.

Imagine that an administrator is restarting a web server, and this operation makes the web pages unavailable for five seconds. Such restarts are usually done at night to reduce the number of users affected, so this is an acceptable period of time. However, a problem arises if Nagios tries to connect to the server during those five seconds and finds that it is down. If it relied on a single result, Nagios would trigger an alert that the web server is down. The server would actually be up and running again within a few seconds, but it could take a couple of minutes for Nagios to find that out.

To handle situations where a service is down for a very short time, or a test has temporarily failed, soft states were introduced. When the result of a check is unknown, or differs from the previous one, Nagios treats the new result as a soft state and retests the host or service several times to make sure that the change is persistent. The number of checks is specified in the host or service configuration. Once the additional tests have confirmed that the new state persists, it is considered a hard state.

Each host and service definition specifies the number of check attempts (the max_check_attempts directive) to be performed before a change is assumed to be permanent. This gives flexibility over how many failures should be treated as an actual problem rather than a temporary one. Setting the number of checks to one causes every change to be treated as hard immediately. The following is an illustration of soft and hard state changes, assuming that the number of checks to be performed is set to three:

[Figure: Soft and Hard States]
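
The progression from soft to hard states can also be traced in a few lines of code. The following Python sketch is a simplified model of the behaviour described here and is not taken from Nagios itself: the function name classify and the "SOFT"/"HARD" labels are assumptions made for illustration, and only the max_check_attempts parameter mirrors a real Nagios setting. It assumes that a problem becomes hard once that many consecutive non-OK results have been seen, and that a recovery from a soft problem is itself reported as soft.

```python
def classify(results, max_check_attempts=3):
    """Label each check result as a SOFT or HARD state (simplified model)."""
    labelled = []
    attempt = 0           # consecutive non-OK results seen so far
    hard_problem = False  # True once a problem has been confirmed as hard
    for status in results:
        if status == "OK":
            # A recovery from an unconfirmed (soft) problem is itself soft;
            # any other OK result is a hard OK.
            state_type = "SOFT" if attempt > 0 and not hard_problem else "HARD"
            attempt = 0
            hard_problem = False
        else:
            attempt += 1
            if hard_problem or attempt >= max_check_attempts:
                hard_problem = True
                state_type = "HARD"
            else:
                state_type = "SOFT"
        labelled.append((status, state_type))
    return labelled


if __name__ == "__main__":
    # A persistent outage: the third consecutive failure becomes hard.
    for status, state_type in classify(["OK", "CRITICAL", "CRITICAL", "CRITICAL", "OK"]):
        print(status, state_type)
```

With three check attempts, the first two CRITICAL results are labelled soft and the third hard. Calling the same function with max_check_attempts=1 labels every result hard, which matches the statement that a single check attempt makes all changes hard immediately.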

This feature makes it possible to ignore short outages of a service. It is also very useful for checks that can occasionally fail even when everything is working correctly. Monitoring devices over SNMP is one such example: a single check might fail, but the second or third attempt will usually succeed.
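
To see why occasional failures of this kind do not turn into real problems, consider the following Python sketch. It is a simplified illustration and not Nagios code: the function name reaches_hard_problem is hypothetical, and the model assumes that a problem only becomes hard after max_check_attempts consecutive non-OK results.

```python
def reaches_hard_problem(results, max_check_attempts=3):
    """Return True if any run of consecutive failures becomes a hard problem."""
    consecutive_failures = 0
    for status in results:
        if status == "OK":
            # Any successful check resets the count of failed attempts.
            consecutive_failures = 0
        else:
            consecutive_failures += 1
            if consecutive_failures >= max_check_attempts:
                return True
    return False


if __name__ == "__main__":
    # An SNMP-style check that fails now and then but recovers on a retry.
    flaky = ["OK", "UNKNOWN", "OK", "UNKNOWN", "UNKNOWN", "OK"]
    print(reaches_hard_problem(flaky))

    # A genuine outage: three failures in a row.
    outage = ["OK", "CRITICAL", "CRITICAL", "CRITICAL"]
    print(reaches_hard_problem(outage))
```

In this model the flaky sequence never reaches a hard problem with three check attempts, while the genuine outage does. Lowering max_check_attempts to one would make even the first isolated failure count as hard.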
