
Nagios In-Depth Configuration

王昊 Linux 2019-12-05

NRPE monitoring server: 192.168.17.195

NRPE client: 192.168.17.197

Add a monitored client on the monitoring server

cd /usr/local/nagios/etc/objects/

cp localhost.cfg 192.168.17.197.cfg    # the IP here is the IP of the client to be monitored

Replace localhost, 127.0.0.1, check_local and linux-servers in the copied default configuration file with values that match the client

sed -i "s#localhost#192.168.17.197#g;s#127.0.0.1#192.168.17.197#g;s#check_local#check#g;s#linux-servers#192.168.17.197#g" 192.168.17.197.cfg
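To see what this substitution does before touching the real file, it can be run against a small sample. The sketch below uses an abbreviated, hand-written excerpt in the style of localhost.cfg (an assumption, not the full shipped file) and a temporary file, so nothing under /usr/local/nagios is modified. It shows in particular that check_local_load becomes check_load, while the linux-server template name is left alone.

```shell
# Abbreviated sample in the style of localhost.cfg (assumed content, for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
define host{
        use                     linux-server
        host_name               localhost
        address                 127.0.0.1
        }
define service{
        use                     local-service
        host_name               localhost
        service_description     Current Load
        check_command           check_local_load!5.0,4.0,3.0!10.0,6.0,4.0
        }
EOF

# Same substitutions as in the article, without -i so the result is only printed.
rewritten=$(sed 's#localhost#192.168.17.197#g;s#127.0.0.1#192.168.17.197#g;s#check_local#check#g;s#linux-servers#192.168.17.197#g' "$sample")
printf '%s\n' "$rewritten"
rm -f "$sample"
```

Previewing the result this way makes it easier to spot renamed commands that no longer match anything in commands.cfg.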

After line 36 of nagios.cfg, add cfg_file=/usr/local/nagios/etc/objects/192.168.17.197.cfg

sed -i "36a cfg_file=/usr/local/nagios/etc/objects/192.168.17.197.cfg" /usr/local/nagios/etc/nagios.cfg
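Running the sed command twice would add the cfg_file line twice. A minimal sketch of an idempotent alternative follows; add_cfg_file is a hypothetical helper name, and it appends the line at the end of the file instead of after line 36, on the assumption that the position of cfg_file directives does not matter. The demo works on a temporary file, not on the real nagios.cfg.

```shell
# add_cfg_file is a hypothetical helper, not a Nagios tool.
# It adds "cfg_file=<path>" to the main config only if that exact line is missing.
add_cfg_file() {
    main_cfg=$1      # e.g. /usr/local/nagios/etc/nagios.cfg
    object_cfg=$2    # e.g. /usr/local/nagios/etc/objects/192.168.17.197.cfg
    line="cfg_file=$object_cfg"
    # -x: match the whole line, -F: treat the line as a fixed string
    if grep -qxF "$line" "$main_cfg"; then
        echo "already present: $line"
    else
        printf '%s\n' "$line" >> "$main_cfg"
        echo "added: $line"
    fi
}

# Demo against a throwaway file.
demo_cfg=$(mktemp)
printf 'cfg_file=/usr/local/nagios/etc/objects/localhost.cfg\n' > "$demo_cfg"
add_cfg_file "$demo_cfg" /usr/local/nagios/etc/objects/192.168.17.197.cfg
add_cfg_file "$demo_cfg" /usr/local/nagios/etc/objects/192.168.17.197.cfg
```

The second call only reports that the line is already there, so the helper is safe to rerun.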

Check the configuration for errors

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Error: Service check command 'check_load' specified in service 'Current Load' for host '192.168.17.197' not defined anywhere!
Error: Service check command 'check_users' specified in service 'Current Users' for host '192.168.17.197' not defined anywhere!
Error: Service check command 'check_disk' specified in service 'Root Partition' for host '192.168.17.197' not defined anywhere!
Error: Service check command 'check_swap' specified in service 'Swap Usage' for host '192.168.17.197' not defined anywhere!
Error: Service check command 'check_procs' specified in service 'Total Processes' for host '192.168.17.197' not defined anywhere!
.....
Total Warnings: 0
Total Errors:   5
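When the validation output is long, it can help to reduce it to the list of command names that Nagios could not resolve. The following is a sketch under the assumption that the error lines have exactly the format shown above; undefined_commands is a hypothetical helper name.

```shell
# undefined_commands is a hypothetical helper: it reads `nagios -v` output on stdin
# and prints each check command reported as "not defined anywhere", once.
undefined_commands() {
    sed -n "s/^Error: Service check command '\([^']*\)' .* not defined anywhere!\$/\1/p" | sort -u
}

# Demo with two of the error lines from the output above.
undefined_commands <<'EOF'
Error: Service check command 'check_load' specified in service 'Current Load' for host '192.168.17.197' not defined anywhere!
Error: Service check command 'check_swap' specified in service 'Swap Usage' for host '192.168.17.197' not defined anywhere!
Total Warnings: 0
EOF
```

Against a live installation the same function could be fed with /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg | undefined_commands.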

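The errors appear because the sed substitution turned check_local_load and the other check_local_* commands into names that commands.cfg does not define. Those checks only work on the local machine anyway; running them against the client requires the Nagios plugins and NRPE, which are not installed there yet. For now, delete the service definitions for load, users, disk, swap and processes from the file

vim 192.168.17.197.cfg

For a first test, keep only three service definitions: PING, SSH and HTTP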

define service{
        use                             local-service         ; Name of service template to use
        host_name                       192.168.17.197
        service_description             PING
        check_command                   check_ping!100.0,20%!500.0,60%
        }



define service{
        use                             local-service         ; Name of service template to use
        host_name                       192.168.17.197
        service_description             SSH
        check_command                   check_ssh
        notifications_enabled           0
        }

define service{
        use                             local-service         ; Name of service template to use
        host_name                       192.168.17.197
        service_description             HTTP
        check_command                   check_http
        notifications_enabled           0
        }

Restart Nagios

/etc/init.d/nagios restart

Refresh the browser and check the Hosts page.

Install NRPE on the monitoring server

yum -y install openssl-devel

cd /usr/local

wget http://prdownloads.sourceforge.net/sourceforge/nagios/nrpe-2.13.tar.gz

tar -xzf nrpe-2.13.tar.gz

cd nrpe-2.13

./configure --enable-ssl --with-ssl-lib && make all && make install-plugin && make install-daemon && make install-daemon-config

chown -R nagios:nagios /usr/local/nagios/

cd /usr/local/nrpe-2.13/sample-config/

cp nrpe.cfg /usr/local/nagios/etc/nrpe.cfg

cd /usr/local/nagios/etc/

vim nrpe.cfg

allowed_hosts=127.0.0.1,localhost

Start NRPE

/usr/local/nagios/bin/nrpe -c nrpe.cfg -d

ps -ef |grep nrpe

nagios    62650      1  0 10:42 ?        00:00:00 /usr/local/nagios/bin/nrpe -c nrpe.cfg -d
root      62660  43906  0 10:43 pts/0    00:00:00 grep --color=auto nrpe

Check the NRPE version

/usr/local/nagios/libexec/check_nrpe -H localhost

NRPE v2.13

The Nagios client needs two packages installed: nagios-plugins and nrpe

Install nagios-plugins

cd /usr/local/

wget http://nagios-plugins.org/download/nagios-plugins-2.2.1.tar.gz

useradd nagios

tar -xzf nagios-plugins-2.2.1.tar.gz

cd nagios-plugins-2.2.1

./configure --prefix=/usr/local/nagios && make && make install

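Once the plugins are installed, install NRPE the same way as on the monitoring server; the steps are not repeated here

NRPE client configuration

vim /usr/local/nagios/etc/nrpe.cfg

Add the check commands. -w sets the warning threshold and -c the critical threshold; adjust the values as needed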

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
------------
Change these to
------------
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
command[check_procs]=/usr/local/nagios/libexec/check_procs -w 500 -c 550
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%
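The names inside command[...] are what the monitoring server will later pass to check_nrpe, so it is worth confirming which names a given nrpe.cfg defines. The sketch below does that with sed; nrpe_command_names and has_nrpe_command are hypothetical helper names, and the demo uses a temporary file holding three of the command lines above.

```shell
# nrpe_command_names is a hypothetical helper: print every name defined
# as command[NAME]=... in an nrpe.cfg-style file.
nrpe_command_names() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# has_nrpe_command succeeds if NAME is defined in the file.
has_nrpe_command() {
    nrpe_command_names "$1" | grep -qxF "$2"
}

# Demo on a throwaway file with three of the commands from the article.
demo_nrpe_cfg=$(mktemp)
cat > "$demo_nrpe_cfg" <<'EOF'
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_procs]=/usr/local/nagios/libexec/check_procs -w 500 -c 550
EOF

for name in check_load check_swap; do
    if has_nrpe_command "$demo_nrpe_cfg" "$name"; then
        echo "$name: defined"
    else
        echo "$name: missing"
    fi
done
```

In the demo file check_swap is deliberately absent, so it is reported as missing.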

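Start NRPE (the relative nrpe.cfg path is resolved against the current directory, so run it from /usr/local/nagios/etc/)

cd /usr/local/nagios/etc/

/usr/local/nagios/bin/nrpe -c nrpe.cfg -d

Check that the port is listening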

netstat -tnl

tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN
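Instead of reading the netstat output by eye, the check for port 5666 can be scripted. This is a sketch that assumes the column layout shown above (local address in the fourth column, state in the last); listening_on_port is a hypothetical helper name.

```shell
# listening_on_port is a hypothetical helper: it reads `netstat -tnl` output
# on stdin and succeeds if a socket in LISTEN state is bound to the given port.
listening_on_port() {
    awk -v port="$1" '
        $NF == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Demo with sample lines in the format shown in the article.
sample_netstat='tcp        0      0 0.0.0.0:5666            0.0.0.0:*               LISTEN
tcp6       0      0 ::1:25                  :::*                    LISTEN'

for port in 5666 80; do
    if printf '%s\n' "$sample_netstat" | listening_on_port "$port"; then
        echo "port $port: listening"
    else
        echo "port $port: not listening"
    fi
done
```

On the client itself the function would be used as netstat -tnl | listening_on_port 5666.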

Check the handshake between the monitoring server and the client

/usr/local/nagios/libexec/check_nrpe -H 192.168.17.197

CHECK_NRPE: Error - Could not complete SSL handshake.

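The SSL handshake could not be completed.

The cause is that allowed_hosts on the NRPE client does not include the monitoring server, so the connection is refused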

cd /usr/local/nagios/etc/

vim nrpe.cfg

allowed_hosts=127.0.0.1
------
Append the address of the NRPE monitoring server
------
allowed_hosts=127.0.0.1,192.168.17.195
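The same edit can be made without opening an editor. The sketch below assumes the file contains exactly one allowed_hosts= line with at least one entry; allow_host is a hypothetical helper name, and the demo runs on a temporary file, not on the real nrpe.cfg.

```shell
# allow_host is a hypothetical helper: append an address to allowed_hosts
# in an nrpe.cfg-style file unless it is already listed.
allow_host() {
    cfg=$1
    addr=$2
    current=$(sed -n 's/^allowed_hosts=//p' "$cfg")
    case ",$current," in
        *",$addr,"*)
            echo "$addr already allowed"
            ;;
        *)
            tmp=$(mktemp)
            sed "s/^allowed_hosts=.*/allowed_hosts=$current,$addr/" "$cfg" > "$tmp"
            cat "$tmp" > "$cfg"    # overwrite in place, keeping the file's permissions
            rm -f "$tmp"
            echo "allowed_hosts=$current,$addr"
            ;;
    esac
}

# Demo on a throwaway file.
demo_allowed_cfg=$(mktemp)
printf 'allowed_hosts=127.0.0.1\n' > "$demo_allowed_cfg"
allow_host "$demo_allowed_cfg" 192.168.17.195
allow_host "$demo_allowed_cfg" 192.168.17.195
cat "$demo_allowed_cfg"
```

As with the manual edit, NRPE still has to be restarted before the new allowed_hosts value takes effect.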

Restart NRPE

ps -ef |grep nrpe

nagios    80828      1  0 Aug25 ?        00:00:09 /usr/local/nagios/bin/nrpe -c nrpe.cfg -d

kill -9 80828

/usr/local/nagios/bin/nrpe -c nrpe.cfg -d

On the NRPE monitoring server

/usr/local/nagios/libexec/check_nrpe -H 192.168.17.197

NRPE v2.13

If the version is shown, communication with the client works

Next, configure the monitoring server

Define the NRPE check command

cd /usr/local/nagios/etc/objects

vim commands.cfg

Append the following to the end of the file

define command{
command_name check_nrpe         
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}

Client host configuration on the Nagios monitoring server

vim /usr/local/nagios/etc/objects/192.168.17.197.cfg

Monitor load, disk, process count, swap and logged-in users

define service{
        use                              local-service
        host_name                        192.168.17.197
        service_description              Current Load   
        check_command                    check_nrpe!check_load
        }

define service{
        use                             local-service
        host_name                       192.168.17.197
        service_description             client-disk
        check_command                   check_nrpe!check_disk
        }

define service{
        use                             local-service
        host_name                       192.168.17.197
        service_description             Current procs
        check_command                   check_nrpe!check_procs  
        }

define service{
        use                             local-service
        host_name                       192.168.17.197
        service_description             Current swap
        check_command                   check_nrpe!check_swap
        }

define service{
        use                             local-service
        host_name                       192.168.17.197
        service_description             Login users
        check_command                   check_nrpe!check_users
        }

Other checks are added the same way: any command defined in nrpe.cfg on the client can be referenced from the monitoring server
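Because the five service blocks differ only in their description and remote command, they can also be generated. The following sketch prints blocks in the same layout as above; nrpe_service is a hypothetical helper name, and its output would still need to be pasted or redirected into the host's .cfg file by hand.

```shell
# nrpe_service is a hypothetical helper: print one service definition that
# runs REMOTE_COMMAND on HOST through check_nrpe.
nrpe_service() {
    host=$1
    description=$2
    remote_command=$3
    cat <<EOF
define service{
        use                             local-service
        host_name                       $host
        service_description             $description
        check_command                   check_nrpe!$remote_command
        }

EOF
}

# Demo: two of the services from the article.
nrpe_service 192.168.17.197 "Current Load" check_load
nrpe_service 192.168.17.197 "Login users" check_users
```

Each remote command name passed here must match a command[...] entry in the client's nrpe.cfg.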

Check the new configuration for errors

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Total Warnings: 0
Total Errors:   0

Restart Nagios

/etc/init.d/nagios restart

Open http://<monitoring-server-IP>/nagios/ in a browser and check the Services page.

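To enable service notifications in the web interface (for example for HTTP): click Service -> HTTP -> Enable notifications for this service -> Commit -> Done

If enabling service notifications in the web interface produces the following error, the fix is covered in a separate post

Error: Could not open command file '/usr/local/nagios/var/rw/nagios.cmd' for update!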

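HTTP keyword check on the Nagios monitoring server

This can be done with the default check_http plugin plus the relevant options. Add the following keyword check command, check_http_word, to commands.cfg

Options: -I specifies the IP address or host name, -u the URL, -p the port, and -s the keyword to look for

Add the command to the monitoring server's configuration file, then reference it from a service definition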

vim /usr/local/nagios/etc/objects/commands.cfg

define command{ 
        command_name    check_http_word 
        command_line    $USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ -s $ARG3$
        }

Check the new configuration for errors

/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Total Warnings: 0
Total Errors:   0

Create a test page in the Apache document root on the client

vim index.html

it works ok

Go back to the monitoring server and test

/usr/local/nagios/libexec/check_http -I 192.168.17.197 -u /index.html -p 80 -s "works"

HTTP OK: HTTP/1.1 200 OK - 272 bytes in 0.002 second response time |time=0.001824s;;;0.000000 size=272B;;;0

If the page content is tampered with and the keyword no longer appears, the output becomes

HTTP CRITICAL: HTTP/1.1 200 OK - string 'works' not found on 'http://192.168.17.197:80/index.html' - 272 bytes in 0.002 second response time |time=0.001997s;;;0.000000 size=272B;;;0
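Nagios decides the service state from the plugin's exit code, not from the text it prints: 0 is OK, 1 is WARNING, 2 is CRITICAL and 3 is UNKNOWN. The sketch below maps an exit code to its state name; plugin_state and run_check are hypothetical helper names, and the demo replaces the real plugin with a stand-in command that exits with 2.

```shell
# plugin_state is a hypothetical helper: map a plugin exit code to its state name.
plugin_state() {
    case $1 in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# run_check runs any command and reports the state its exit code stands for.
run_check() {
    "$@" && status=0 || status=$?
    echo "exit code $status -> $(plugin_state "$status")"
}

# Demo with a stand-in for a failing keyword check.
run_check sh -c 'echo "HTTP CRITICAL: string not found"; exit 2'
```

With the plugins installed, the stand-in could be replaced by the check_http command line used above.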

To show the result of this check in the web interface, add a service definition

vim /usr/local/nagios/etc/objects/192.168.17.197.cfg

define service{
        use                             local-service
        host_name                       192.168.17.197
        service_description             HTTPD-WORD-WORKS
        check_command                   check_http_word!'/index.html'!80!'works'
        }
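The check_command value is split on "!": the first part is the command name and the remaining parts fill $ARG1$, $ARG2$ and so on in that command's command_line. The sketch below imitates this substitution to make the mapping visible. It is an illustration only, not how Nagios is implemented; expand_command is a hypothetical helper name, /usr/local/nagios/libexec is assumed as the value of $USER1$, and arguments containing | or & are not handled.

```shell
# expand_command is a hypothetical illustration of Nagios macro substitution.
expand_command() {
    command_line=$1     # command_line from commands.cfg
    check_command=$2    # check_command from the service definition
    host_address=$3     # value for $HOSTADDRESS$
    plugin_dir=$4       # value for $USER1$ (assumed)

    # Split on "!" into positional parameters; disable globbing while splitting.
    old_ifs=$IFS
    set -f
    IFS='!'
    set -- $check_command
    IFS=$old_ifs
    set +f
    shift               # drop the command name, keep the arguments

    result=$command_line
    n=1
    for arg in "$@"; do
        result=$(printf '%s\n' "$result" | sed 's|\$ARG'"$n"'\$|'"$arg"'|g')
        n=$((n + 1))
    done
    printf '%s\n' "$result" |
        sed -e 's|\$USER1\$|'"$plugin_dir"'|g' -e 's|\$HOSTADDRESS\$|'"$host_address"'|g'
}

# Demo with the keyword check from this section.
expand_command '$USER1$/check_http -I $HOSTADDRESS$ -u $ARG1$ -p $ARG2$ -s $ARG3$' \
    "check_http_word!'/index.html'!80!'works'" 192.168.17.197 /usr/local/nagios/libexec
```

The printed line has the same shape as the check_http command that was run by hand earlier, with the quotes left for the shell to interpret.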

/etc/init.d/nagios reload

Refresh the browser and check the Services page.

Nagios email and SMS alerts

First check whether the mail service port is open

netstat -ntl

tcp6       0      0 ::1:25                  :::*                    LISTEN 

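Install the mail service

Try sending a message from the local machine first

echo "it works ok" | mail -s "root" wh1647458225@163.com

On the Nagios server, change the mail recipient in the configuration file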

vim /usr/local/nagios/etc/objects/contacts.cfg

email                           wh1647458225@163.com

/etc/init.d/nagios reload

To test, edit index.html in the client's Apache document root and change the keyword to wors

vim index.html

it wors ok!

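Then watch for the alert in the web interface (allow a little time, since checks run on an interval)

An alert email then arrives

A recovery email is also sent once the service returns to normal; that is not shown here.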

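Copyright notice

This article represents the author's views only and does not represent the position of this site.
This article is published with the author's authorization and may not be reproduced without permission.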
