• May 5, 2023

Nagios Log Monitoring – Monitor log files on Unix efficiently

Nagios Log File Monitoring – Monitoring log files using Nagios can be as difficult as it is with any other monitoring application. However, with Nagios, once you have a script or log monitoring tool that can monitor a specific log file the way you want to monitor it, Nagios can be trusted to handle the rest. This kind of versatility is what makes Nagios one of the most popular and easy-to-use monitoring applications out there. It can be used to monitor anything effectively. I personally love it. It has no equal!

My name is Jacob Bowman and I work as a Nagios Monitoring Specialist. I’ve found, given the number of requests I get at work to monitor log files, that log file monitoring is a big problem. IT departments have an ongoing need to monitor their UNIX log files to ensure application or system problems can be detected early. When problems are known, unplanned outages can be completely avoided.

But the common question that many ask is, what monitoring application is available that can effectively monitor a log file? The simple answer to this question is NONE! The existing log monitoring applications require too much configuration, effectively making them unworthy of consideration.

Log monitoring should allow pluggable arguments on the command line (rather than in separate configuration files) and should be very easy for the average UNIX user to understand and use. Most log monitoring tools are not like this. They are often complex and take time to learn, usually by reading page after page of installation instructions. In my opinion, this is an unnecessary problem that can and should be avoided.

Again, I am a firm believer that to be efficient one should be able to run a program directly from the command line without needing to go elsewhere to edit the configuration files.

So the best solution, in most cases, is to write a log monitoring tool for your particular needs or download a log monitoring program that has already been written for your type of UNIX environment.

Once you have that log monitoring tool, you can hand it to Nagios, and Nagios will schedule it to run at regular intervals. If one of those runs finds the issues, patterns, or strings that you told it to watch for, Nagios will raise an alert and send notifications to whoever you choose.
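
To make that handoff concrete, here is a minimal Python sketch of the kind of check Nagios can schedule. It is not logrobot and not how logrobot works internally; the names `check_log`, `warn`, and `crit` are hypothetical and the thresholds are illustrative assumptions. The only fixed convention is the Nagios plugin exit codes: 0 for OK, 1 for WARNING, 2 for CRITICAL.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def check_log(lines, patterns, warn, crit):
    """Count lines containing any pattern and map the count to a Nagios status.

    `warn` and `crit` are hypothetical thresholds chosen for this sketch.
    """
    hits = sum(1 for line in lines if any(p in line for p in patterns))
    if hits >= crit:
        return CRITICAL, hits
    if hits >= warn:
        return WARNING, hits
    return OK, hits


def run(path, patterns=("error", "panic"), warn=5, crit=10):
    """Check one log file, print a one-line summary, and return the exit code."""
    with open(path, errors="replace") as fh:
        status, hits = check_log(fh, patterns, warn, crit)
    print(f"{LABELS[status]} - {hits} matching lines in {path}")
    return status
```

A wrapper script would finish with something like `sys.exit(run(sys.argv[1]))`, so that Nagios sees the status as the process exit code.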

But then you wonder, what kind of log monitoring tool should you write or download for your environment?

The log monitoring program you should get to monitor your production log files should be as simple as the following, but still be powerfully versatile:

Example: logrobot /var/log/messages 60 'error' 'panic' 5 10 -found

Output: 2—1380—352—ATWF—(Tuesday/1)-(16:15)—(Tuesday/1)-(17:15:00)

Explanation:

The “-found” option searches /var/log/messages for the strings “error” and “panic”. When the search is done, the tool exits with 0 (for OK), 1 (for WARNING), or 2 (for CRITICAL). Each time you run that command, it prints a one-line statistical report similar to the output above. The fields are delimited by “—“.

The first field is 2 = the status is CRITICAL.

The second field is 1380 = number of seconds since the strings you specified last occurred in the log.

The third field is 352 = 352 occurrences of the strings “error” and “panic” were found in the log in the last 60 minutes.

The fourth field is ATWF = Don’t worry about this for now. Irrelevant.

The fifth and sixth fields = The log file was searched from (Tuesday/1)-(16:15) to (Tuesday/1)-(17:15:00). In the data collected from that time period, 352 “error” and “panic” occurrences were found.
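
If you want to feed that one-line report into another script, the field layout described above can be split apart with a few lines of Python. This is only a sketch based on the single sample line in this post: `LogrobotResult` and `parse_result` are hypothetical names, and accepting a run of plain hyphens as well as the long dash is an assumption about how the delimiter may appear on a terminal.

```python
import re
from typing import NamedTuple


class LogrobotResult(NamedTuple):
    status: int              # 0 = OK, 1 = WARNING, 2 = CRITICAL
    seconds_since_last: int  # seconds since the strings last occurred
    match_count: int         # occurrences found in the search window
    flags: str               # fourth field, left uninterpreted here
    window_start: str        # start of the searched period
    window_end: str          # end of the searched period


def parse_result(line):
    """Split a six-field report line into a LogrobotResult."""
    # Split on the long dash (U+2014) or on two or more hyphens (assumption).
    # A single hyphen is left alone because it appears inside the time fields.
    fields = re.split(r"\s*(?:\u2014|-{2,})\s*", line.strip())
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    status, age, count, flags, start, end = fields
    return LogrobotResult(int(status), int(age), int(count), flags, start, end)
```

Calling `parse_result` on the sample output gives a status of 2, 1380 seconds since the last match, and a match count of 352.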

If you really want to see all 352 occurrences, you can run the following command and pass the “-show” option to the logrobot tool. This will display on the screen all matching lines in the log that contain the strings you specified and that were written to the log in the last 60 minutes.

Example: logrobot /var/log/messages 60 'error' 'panic' 5 10 -show

The “-show” option prints every line in the log file that contains the string “error” or “panic” and was written within the 60-minute window you specified. Of course, you can always change the parameters to suit your particular needs.
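
If you decide to write your own tool instead, the time-window filtering can be sketched roughly as follows. This is not logrobot's implementation. It assumes each line starts with a traditional syslog timestamp such as "May  5 17:01:12", that the log year matches the current year, and that month names are in English; `recent_matches` is a hypothetical name.

```python
from datetime import datetime, timedelta


def recent_matches(lines, patterns, now, minutes):
    """Return lines from the last `minutes` before `now` that contain any pattern."""
    cutoff = now - timedelta(minutes=minutes)
    hits = []
    for line in lines:
        try:
            # The first 15 characters hold "Mon DD HH:MM:SS"; syslog omits the
            # year, so the year of `now` is assumed.
            stamp = datetime.strptime(f"{now.year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parseable timestamp
        if cutoff <= stamp <= now and any(p in line for p in patterns):
            hits.append(line)
    return hits
```

Passing `minutes=60` and the patterns "error" and "panic" mirrors the arguments used in the example command above.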

With this Nagios log monitoring tool (logrobot), you can perform the magic that famous big name monitoring applications cannot perform.

Once you write or download a script or log monitoring tool like the one above, you can have Nagios or cron run it regularly, which in turn gives you a bird’s-eye view of the logged activity on all your servers.

Do you have to use Nagios to run it regularly? Absolutely not. You can use whatever you want.
