
Resource Monitoring

To keep servers and workstations responsive and to prevent them from being overloaded, it is also worth monitoring system usage with additional measures. Nagios offers several plugins that monitor resource usage and report when the limits set for these checks are exceeded.

System Load

The first thing that should always be monitored is the system load. This value reflects the number of processes and the amount of CPU capacity that they are utilizing. This means that if one process is using up to 50% of the CPU capacity, the value will be around 0.5; and if four processes try to utilize the maximum CPU capacity, the value will be around 4.0. The system load is reported as three values: the average load over the last minute, the last 5 minutes, and the last 15 minutes. The syntax of the command is as follows:

check_load [-r] -w wload1,wload5,wload15 -c cload1,cload5,cload15

Values for the -w and -c options should each be given as three values separated by commas, one for each averaging period. If any of the load averages exceeds the corresponding limit, a warning or critical status is returned, respectively. Here is a sample command definition that uses warning and critical load limits as arguments:

  define command
  {
    command_name  check_load
    command_line  $USER1$/check_load -w $ARG1$ -c $ARG2$
  }
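
The threshold rule described above can be illustrated with a short Python sketch. It is not the plugin's own code: the function names are hypothetical, and the strict "greater than" comparison is an assumption based on the word "exceeds" in the text.

```python
from typing import Sequence, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_triple(text: str) -> Tuple[float, float, float]:
    """Parse a 'load1,load5,load15' string into three floats."""
    parts = [float(p) for p in text.split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated values: %r" % text)
    return parts[0], parts[1], parts[2]


def load_status(loads: Sequence[float], warn: str, crit: str) -> int:
    """Compare the 1, 5 and 15 minute averages with their limits.

    Critical limits are checked first, so the most severe state wins.
    """
    warn_limits = parse_triple(warn)
    crit_limits = parse_triple(crit)
    if any(load > limit for load, limit in zip(loads, crit_limits)):
        return CRITICAL
    if any(load > limit for load, limit in zip(loads, warn_limits)):
        return WARNING
    return OK
```

On Unix-like systems, the three current averages could be obtained with os.getloadavg() and passed as the first argument, for example load_status(os.getloadavg(), "5.0,4.0,3.0", "10.0,6.0,4.0"). The limit values here are purely illustrative.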

Checking Processes

Nagios also offers a way to monitor the total number of processes. Nagios can be configured to monitor all processes, only running ones, those consuming CPU, those consuming memory, or a combination of these criteria. The syntax and options are as follows:

check_procs -w <range> -c <range> [-m metric] [-s state]
            [-p ppid] [-u user] [-r rss] [-z vsz] [-P %cpu]
            [-a argument-array] [-C command] [-t timeout] [-v]

Values for the -w and -c options can either be a single value or take the form <min>:<max>. In the first case, a warning or critical state is returned if the value (the number of processes by default) exceeds the specified number. In the second case, the appropriate status is returned if the value is lower than <min> or higher than <max>. Sample commands to monitor the total number of processes and to monitor the number of specific processes are as follows. The second command, for example, can be used to check that a specific server is running and has not created too many processes. In this case, the warning and critical values should be specified as ranges starting from 1.

  define command
  {
    command_name  check_procs_num
    command_line  $USER1$/check_procs -m PROCS -w $ARG1$ -c $ARG2$
  }
  define command
  {
    command_name  check_procs_cmd
    command_line  $USER1$/check_procs -C $ARG1$ -w $ARG2$ -c $ARG3$
  }
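
The two threshold forms described above (a single value, or <min>:<max>) can be sketched in Python as follows. This is only an illustration of the rule as stated in the text, with hypothetical function names. It does not cover any other range forms that the real plugin may accept.

```python
from typing import Optional, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2


def parse_range(spec: str) -> Tuple[Optional[int], int]:
    """Return (minimum, maximum); minimum is None for a single value."""
    if ":" in spec:
        low, high = spec.split(":", 1)
        return int(low), int(high)
    return None, int(spec)


def violates(value: int, spec: str) -> bool:
    """True if value is below <min> (when given) or above the maximum."""
    low, high = parse_range(spec)
    if low is not None and value < low:
        return True
    return value > high


def procs_status(count: int, warn: str, crit: str) -> int:
    """Evaluate a process count against warning and critical thresholds."""
    if violates(count, crit):
        return CRITICAL
    if violates(count, warn):
        return WARNING
    return OK
```

With ranges starting from 1, such as a warning range of 1:10 and a critical range of 1:20 (illustrative values), a count of 0 is reported as critical. This is how a check of this kind detects that the monitored server process is not running at all.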

Monitoring Logged-in Users

It is also possible to use Nagios to monitor the number of users currently logged in to a particular machine. The syntax is very simple, and there are no options other than the warning and critical limits:

check_users -w limit -c limit

A command definition that uses warning or critical limits specified in the arguments is as follows:

  define command
  {
    command_name  check_users
    command_line  $USER1$/check_users -w $ARG1$ -c $ARG2$
  }
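
The command definitions in this section rely on the $USER1$ and $ARGn$ macros. In Nagios, the arguments are supplied in a check_command value, separated by exclamation marks after the command name. The Python sketch below shows roughly how such a value could be expanded into the final command line. It is a simplified illustration with a hypothetical function name and an assumed plugin directory, and it ignores details such as escaped exclamation marks.

```python
def expand_command(command_line: str, check_command: str, user1: str) -> str:
    """Substitute $USER1$ and $ARGn$ macros into a command_line string.

    check_command has the form 'name!arg1!arg2...'. The command name
    itself is not needed for the substitution and is discarded.
    """
    _name, *args = check_command.split("!")
    expanded = command_line.replace("$USER1$", user1)
    for index, arg in enumerate(args, start=1):
        expanded = expanded.replace("$ARG%d$" % index, arg)
    return expanded


# The plugin directory is an assumption; it varies between installations.
PLUGIN_DIR = "/usr/lib/nagios/plugins"

print(expand_command(
    "$USER1$/check_users -w $ARG1$ -c $ARG2$",
    "check_users!20!50",
    PLUGIN_DIR,
))
```

Here the limits 20 and 50 are illustrative: a warning would be raised when more than 20 users are logged in, and a critical state when more than 50 are.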