Nagios® is a host and service monitor designed to inform you of network problems before your clients, end-users or managers do. It has been designed to run under the Linux OperatingSystem, but works fine under most Unix variants as well. The monitoring daemon runs intermittent checks on hosts and services you specify using external "plugins" which return status information to Nagios. When problems are encountered, the daemon can send notifications out to administrative contacts in a variety of different ways (Email, IM, SMS, etc.). Current status information, historical logs, and reports can all be accessed via a WebBrowser.

It used to go by the name NetSaint. See NagiosNotes for helpful tips.

  • Monitoring of network services (SMTP, POP3, HTTP, NNTP, Ping, etc.)
  • Monitoring of host resources (CPU load, disk and memory usage, running processes, log files, etc.)
  • Simple plugin design that allows users to easily develop their own host and service checks
  • Ability to define network host hierarchy, allowing detection of and distinction between hosts that are down and those that are unreachable
  • Contact notifications when service or host problems occur and get resolved (via Email, pager, or other user-defined method)
  • Optional escalation of host and service notifications to different contact groups
  • Ability to define event handlers to be run during service or host events for proactive problem resolution
  • Support for implementing redundant and distributed monitoring servers
  • External command interface that allows on-the-fly modifications to be made to the monitoring and notification behavior through the use of event handlers, the web interface, and third-party applications
  • Retention of host and service status across program restarts
  • Scheduled downtime for suppressing host and service notifications during periods of planned outages
  • Ability to acknowledge problems via the web interface
  • Web interface for viewing current network status, notification and problem history, log file, etc.
  • Simple authorization scheme that allows you to restrict what users can see and do from the web interface
  • Generates pretty network status maps
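
The "simple plugin design" mentioned above comes down to a small contract: a plugin is any executable that prints one line of status text and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Below is a minimal sketch of a disk usage check following that convention. The function names and the 80%/90% thresholds are illustrative assumptions, not part of any stock plugin.

```python
import shutil

# Exit codes that Nagios plugins use to report state.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(percent_used, warn=80.0, crit=90.0):
    """Map a usage percentage to (exit_code, label).

    The default thresholds are made up for this example.
    """
    if percent_used >= crit:
        return CRITICAL, "CRITICAL"
    if percent_used >= warn:
        return WARNING, "WARNING"
    return OK, "OK"


def check_disk(path="/"):
    """Return (exit_code, status_line) for the filesystem holding `path`."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # If the check itself cannot run, report UNKNOWN rather than guess.
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero size"
    percent = 100.0 * usage.used / usage.total
    code, label = classify(percent)
    return code, f"DISK {label} - {percent:.1f}% used on {path}"
```

To use this as a real plugin, a script would print the status line returned by `check_disk()` and pass the code to `sys.exit()`. Nagios reads both the exit code and the printed line.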

Nagios has many features that NetSaint lacks, but the most important one is that it can be configured using template-based configuration files. This simplifies the configuration hugely and reduces the chance of mistakes. Your config for a server goes from this:

host[gatekeeper]=Gatekeeper (Main Cloverly Server);;;check-host-alive;5;60;24x7;1;1;1;

to this:

define host{
  use                     generic-host            ; Name of host template to use
  host_name               gatekeeper
  alias                   Gatekeeper (Main Cloverly Server)
  max_check_attempts      3
  notification_interval   120
  notification_period     24x7
  notification_options    d,u,r
}
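
To illustrate why templates cut down on mistakes, here is a rough sketch of how a `use` directive can be thought of: the host starts from the template's settings and overrides only what it states itself. This is a simplified model with a single parent per object, not Nagios's actual parser, and the values inside the `generic-host` template are hypothetical.

```python
# Object definitions keyed by name. Only the "gatekeeper" values come from
# the example above; the contents of "generic-host" are made up here.
OBJECTS = {
    "generic-host": {
        "check_command": "check-host-alive",
        "max_check_attempts": "5",
        "notification_period": "24x7",
    },
    "gatekeeper": {
        "use": "generic-host",
        "host_name": "gatekeeper",
        "alias": "Gatekeeper (Main Cloverly Server)",
        "max_check_attempts": "3",
        "notification_interval": "120",
        "notification_period": "24x7",
        "notification_options": "d,u,r",
    },
}


def resolve(name, objects):
    """Return the effective settings for `name`, following its `use` chain."""
    obj = objects[name]
    parent = obj.get("use")
    # Start from the parent's effective settings, if there is a parent.
    effective = resolve(parent, objects) if parent else {}
    # The object's own directives win over inherited ones.
    effective.update({k: v for k, v in obj.items() if k != "use"})
    return effective


host = resolve("gatekeeper", OBJECTS)
print(host["check_command"])       # inherited from the template
print(host["max_check_attempts"])  # overridden by the host
```

Settings shared by many hosts live in one place, so a change to the template reaches every host that uses it.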

Some downsides that I have encountered so far:

  • Very basic authentication method (Apache htaccess files) that does not integrate well with existing authentication systems
  • Notifications take some tuning to ensure that you do not get flooded with Email.
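
As a rough feel for the flooding problem, the sketch below counts how many notifications one contact would receive for a single uninterrupted outage at different `notification_interval` settings. It assumes the interval is in minutes, that the first notification goes out as soon as the problem is confirmed, and that 0 means "notify once". Escalations, acknowledgements and multiple failing services are ignored, and the function name is hypothetical.

```python
def count_notifications(outage_minutes, notification_interval):
    """Count notifications sent for one continuous problem.

    Assumes outage_minutes > 0. A notification_interval of 0 is treated
    as "send the first notification only".
    """
    if notification_interval == 0:
        return 1
    # One notification at t=0, then one every `notification_interval`
    # minutes for as long as the problem lasts.
    return len(range(0, outage_minutes, notification_interval))


# An 8-hour overnight outage on a single host:
for interval in (0, 120, 5):
    print(interval, count_notifications(480, interval))
```

Multiply the result by the number of services on the failed host and the number of contacts, and a short interval quickly turns into hundreds of messages.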



Courtesy of FreshMeat. See also the official screenshots.

Cool things built on Nagios

GroundWork Open Source is a full web-controllable monitoring suite, built on top of Nagios.