- Learning NAGIOS 3.0
- Wojciech Kocjan
- 2021-08-25 18:05:37
Chapter 1. Introduction
Imagine you're working as an administrator of a large IT infrastructure. You have just started receiving emails saying that a web application has stopped working. When you try to access the same page, it just doesn't load. What are the possibilities? Is it the router? Or the firewall? Perhaps the machine hosting the page is down? Before you can even start thinking rationally about what to do, your boss calls about the critical situation and demands an explanation. In a panic, you'll probably start plugging things in and out of the network, rebooting the machine and so on, and none of that helps.
After hours of nervously digging into the issue, you finally find the cause: the web server was working properly, but was timing out on communication with the database server. This was because the machine with the database was not getting a correct IP address, as yet another box had run out of memory and the Dynamic Host Configuration Protocol (DHCP) server on it had stopped working. Imagine how much time it would take to find all that out manually. It would be a nightmare if the database server was in another branch of the company, in a different time zone, and perhaps the people over there were still sleeping.
And what if you had Nagios up and running across your entire company? You would just need to go to the web interface and see that there are no problems with the web server or the machine it is running on. There would also be a list of what is wrong: the machine serving IP addresses to the entire company is not doing its job, and the database is down. If the setup also monitored the DHCP server, you would get a warning email that very little swap memory is available on it, or that too many processes are running. Maybe it would even have an event handler for such cases that simply kills or restarts noncritical processes. Nagios would also try to restart the DHCP server process over the network in case it is down.
In the worst case, Nagios would cut hours of investigation down to 10 minutes. In the best case, you would just get an email saying that there was a problem, followed by another one saying that the problem has already been fixed. You would then disable a few services, increase the swap size on the DHCP machine, and solve the problem once and for all. And nobody would even notice there had been a problem.