Big Sister uses intelligent health checks to verify system, application and content availability.
System availability can be verified, e.g., through remote ping checks on (one of) the system's NICs. Disk capacity, errors in syslog and other parameters can be made available for further processing through agent software (uxmon) installed on the monitored system.
Application availability can be verified through an agent on the system running the application or, for a network application, by connecting to its TCP service port. Full-path health checks that go beyond checking the TCP service port exist for some network applications, e.g. mail, web or DNS servers.
Content availability is of particular interest on web servers. This special health check tests whether a specified URL is available.
uxmon-net entries may span multiple lines. Normally the end of a line also ends the configuration entry. However, if a line ends with a \ character, the following line is treated as part of the same entry. If a '#' character is found, all characters after it are treated as a comment.
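The continuation and comment rules above can be sketched in a few lines of Python. This is an illustration only, not uxmon's actual parser: the function name `read_entries` is hypothetical, and the sketch assumes that comments are stripped before the trailing \ is examined and that '#' cannot be quoted or escaped.

```python
def read_entries(text):
    """Join backslash-continued lines and drop '#' comments.

    Assumptions (not confirmed by the manual): comments are removed
    before the trailing backslash is checked, and '#' cannot be escaped.
    """
    entries = []
    pending = ""  # text collected from lines that ended with a backslash
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].rstrip()  # cut off the comment part
        if line.endswith("\\"):
            pending += line[:-1] + " "  # entry continues on the next line
            continue
        entry = " ".join((pending + line).split())  # normalize whitespace
        pending = ""
        if entry:
            entries.append(entry)
    if pending.strip():  # file ended right after a continuation line
        entries.append(" ".join(pending.split()))
    return entries


sample = (
    "myhost proto=icmp \\\n"
    "    ping ping   # two pings\n"
    "# a line holding only a comment\n"
    "myserver bsdisplay\n"
)
entries = read_entries(sample)
```

With the sample text, the first two physical lines collapse into a single entry and the comment-only line disappears.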
Most of the checks accept arguments. Arguments always precede the check and are of the form
argument1=value1 argument2=value2 ... check
The argument list only applies to the check immediately following the last argument in the list, thus
myhost proto=icmp ping ping
will run two ping checks against the host myhost: the first does an ICMP ping, while the second pings using the default protocol (usually UDP). The proto argument does not influence the second ping check, but of course you can write something (rather pointless) like
myhost proto=icmp ping proto=icmp ping
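The scoping rule (an argument list applies only to the check that immediately follows it) can be made concrete with a small Python sketch. It is not taken from uxmon itself: `parse_entry` is a hypothetical helper, and it assumes that tokens are separated by whitespace and that any token containing '=' is an argument.

```python
def parse_entry(entry):
    """Split one entry into (host, [(check, args), ...]).

    Assumption: every token containing '=' is an argument, every other
    token after the host name is a check.
    """
    host, *tokens = entry.split()
    checks = []
    args = {}  # arguments collected for the next check only
    for token in tokens:
        if "=" in token:
            key, value = token.split("=", 1)
            args[key] = value
        else:
            checks.append((token, args))
            args = {}  # arguments never carry over to a later check
    # Arguments left over with no check after them are silently dropped here.
    return host, checks


host, checks = parse_entry("myhost proto=icmp ping ping")
```

Resetting `args` after each check is what keeps proto=icmp from leaking into the second ping.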
You will find a complete and hopefully up-to-date list of available health checks with their arguments in the reference part of this documentation.
Two special pseudo-checks in the uxmon-net file point your agent to the server(s) that status reports should be sent to: the bsdisplay and bbdisplay checks. The line
myserver bsdisplay
for instance, makes the agent send status information to the server myserver, where myserver speaks the Big Sister protocol. You can use uxmon in conjunction with a Big Brother server by changing the above line into
myserver bbdisplay
In this case the agent will suppress any feature not supported by Big Brother and use the Big Brother protocol to talk to the server (see figure 2.1).
Of course multiple bsdisplay / bbdisplay lines may appear in uxmon-net. In this case the agent will report its status information multiple times to (potentially) different servers.
As one would expect, the bsdisplay pseudo-check accepts a few arguments. For example, the line
myserver fqdn=no bsdisplay
will report status information to myserver, stripping the domain part from all host names.
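As a rough illustration of what stripping the domain means, the following Python sketch reduces a fully qualified name to its first label. The helper `strip_domain` is hypothetical, and leaving IP addresses untouched is an assumption of this sketch; the manual does not say how uxmon treats them.

```python
import ipaddress


def strip_domain(hostname):
    """Return the host name without its domain part."""
    try:
        # Assumption: a literal IP address is reported unchanged.
        ipaddress.ip_address(hostname)
        return hostname
    except ValueError:
        # Not an IP address: keep only the label before the first dot.
        return hostname.split(".", 1)[0]


short_name = strip_domain("web1.example.com")
```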
Other useful arguments, relevant if you want Big Sister to keep statistics on performance data, are listed in section ??.
By default uxmon runs checks and reports information at 5-minute intervals. However, some checks put noticeable load on the target system or are of no short-term relevance, so you may prefer to run them less frequently. Other checks may be extremely important and should run more often. For such cases you can define your own check frequencies. Every check (including the bsdisplay and bbdisplay pseudo-checks) accepts the special argument frequency. E.g. the uxmon-net lines
localhost frequency=180 metastat
importantmachine frequency=1 ping
will run the metastat check against the local machine every 180 minutes, while the ping check against importantmachine runs every minute.
Note on frequencies: the argument name frequency is a little misleading. It is not really a frequency but rather the time interval in minutes between two runs of a check.
When defining check frequencies keep some rules in mind:
Running a check more often than the fastest bsdisplay pseudo-check is pointless. Check results are only reported during bsdisplay runs, not necessarily immediately after each check's run. So the above example only makes sense if you have, for instance, the following uxmon-net:
localhost frequency=180 metastat
importantmachine frequency=1 ping
myserver frequency=1 bsdisplay
You must run bsdisplay checks at least every 10 minutes. The server relies on the agents to report their status rather frequently. If no status information is coming in for 15 minutes the server will assume the agent or communication to the agent is dead and will set status to purple (no report).
The above rule does not apply to other checks. Even if you run certain checks only every few hours the bsdisplay check will report the result of the last check run. So the server will not change status to purple.
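The rules above lend themselves to a simple sanity check. The Python sketch below is a hypothetical lint helper, not part of Big Sister: `check_intervals` and its input format are invented for illustration, and it assumes that bbdisplay entries follow the same rules as bsdisplay entries.

```python
DEFAULT_INTERVAL = 5        # minutes; the default interval stated in the text
MAX_DISPLAY_INTERVAL = 10   # minutes; upper bound for display pseudo-checks
DISPLAY_CHECKS = {"bsdisplay", "bbdisplay"}  # assumption: same rules for both


def check_intervals(entries):
    """Return warnings for a list of (host, check, interval) triples.

    interval is given in minutes; None means "use the default".
    """
    resolved = [(h, c, DEFAULT_INTERVAL if i is None else i)
                for h, c, i in entries]
    display = [i for _, c, i in resolved if c in DISPLAY_CHECKS]
    if not display:
        # Assumption: without a display entry no results are sent anywhere.
        return ["no bsdisplay/bbdisplay entry found"]
    fastest = min(display)
    warnings = []
    for host, check, interval in resolved:
        if check in DISPLAY_CHECKS:
            if interval > MAX_DISPLAY_INTERVAL:
                warnings.append(
                    f"{host} {check}: interval {interval} exceeds "
                    f"{MAX_DISPLAY_INTERVAL} minutes")
        elif interval < fastest:
            warnings.append(
                f"{host} {check}: interval {interval} is shorter than "
                f"the fastest display interval {fastest}")
    return warnings


config = [
    ("localhost", "metastat", 180),
    ("importantmachine", "ping", 1),
    ("myserver", "bsdisplay", 1),
]
problems = check_intervals(config)
```

Slow checks such as the 180-minute metastat run never trigger a warning, matching the rule that only the display pseudo-checks have to run frequently.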
As your uxmon-net files grow more complex, you will probably tire of listing the same check arguments again and again. Fortunately, uxmon supports setting defaults for certain arguments. For instance, the configuration
localhost frequency=2 type=ext2 diskfree
localhost frequency=2 memory
localhost frequency=2 procs=sendmail procs
fileserver frequency=5 nfs
myserver frequency=2 bsdisplay
can be simplified to
DEFAULT frequency=2 ALL
DEFAULT type=ext2 diskfree
localhost diskfree
localhost memory procs=sendmail procs
fileserver frequency=5 nfs
myserver bsdisplay
The DEFAULT statements in this example set the default check interval for all (ALL) checks to 2 minutes, and set the default type for all diskfree checks to ext2.
Of course you can override defaults by simply listing an argument with a non-default value as usual. For instance, in the example above the interval for the nfs check is explicitly set to 5 minutes, overriding the default interval of 2 minutes.
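One way to picture how defaults and explicit arguments combine is the following Python sketch. `resolve_args` and the layout of the `defaults` dictionary are hypothetical. That explicit arguments win is stated above; that a check-specific DEFAULT takes precedence over an ALL default is an assumption of this sketch.

```python
def resolve_args(check, explicit, defaults):
    """Merge DEFAULT values with the arguments given for one check.

    defaults maps a check name (or "ALL") to a dict of default arguments.
    """
    merged = dict(defaults.get("ALL", {}))   # defaults for every check
    merged.update(defaults.get(check, {}))   # assumption: beats ALL
    merged.update(explicit)                  # explicit arguments always win
    return merged


# The two DEFAULT statements from the example above.
defaults = {
    "ALL": {"frequency": "2"},
    "diskfree": {"type": "ext2"},
}

diskfree_args = resolve_args("diskfree", {}, defaults)
nfs_args = resolve_args("nfs", {"frequency": "5"}, defaults)
```

Here the diskfree check ends up with both frequency=2 and type=ext2, while the nfs check keeps its explicit frequency=5.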