- display and update information about processes on a cluster of machines
[-m proc | load]
command can be used to remotely monitor a cluster of machines for CPU and
process information. It provides the same information as the top program,
but rather than showing it just for one machine, it gathers the information
from a cluster of machines, and sorts it all together into a single,
periodically updating report.
The following options are available:
to run one iteration on each node, and print out a single report of the
processes running on those nodes, rather than continuously updating.
option determines if the output is process output, or load average output.
Load average output gives the load average and memory usage statistics for the
machines being monitored. Process output gives a list of processes on each
machine, sorted by CPU usage. The default mode is proc.
flag selects the interval of time between updates of each node's process
or load information. Setting this value too low is unwise, because it may
produce unnecessary load on the target hosts. The interval is specified
option is specified, followed by a number, it sets the fanout size of the
cluster. The fanout size is the number of nodes a command will run on in
parallel at one time. Thus an 80-node cluster, with a fanout size of 64,
would run 64 nodes in parallel, then, when all have finished, it would
execute the command on the last 16 nodes. The fanout size defaults to 64.
This option overrides the
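
The batching arithmetic described for the fanout size can be sketched in a
few lines of Python. This is only an illustration of the 80-node example,
not the program's actual implementation; the function name fanout_batches
and the node names are hypothetical.

```python
def fanout_batches(nodes, fanout=64):
    """Split a node list into consecutive batches of at most `fanout` nodes.

    Each batch stands for the set of nodes contacted in parallel; the next
    batch starts only after the previous one has finished.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    return [nodes[i:i + fanout] for i in range(0, len(nodes), fanout)]


# Hypothetical 80-node cluster with the default fanout of 64.
cluster = ["node%02d" % n for n in range(80)]
sizes = [len(batch) for batch in fanout_batches(cluster)]
print(sizes)  # -> [64, 16]
```

A smaller fanout produces more, shorter batches, which trades total run time
for a lower number of simultaneous connections from the local host.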
option is specified, followed by a comma-separated list of group names, the
command will only be run on that group of nodes. A node may be a part of
more than one group if desired; however, running without the
option will run the command on the same node as many times as it appears in the
file specified by the
environment variable. This option is silently ignored if used with the
option is specified, followed by a username, the commands will be run under
that userid on the remote machines. Proper authentication for that user must
be configured on the remote machines for this to work.
option is used to set the timeout in seconds to be used when testing remote
connections. The default is five seconds.
option can be used to set the port number that testing should occur on when
testing remote connections. The default behavior is to guess based on the
remote command name.
option does not issue any commands, but displays information about the
cluster, and the fanout groupings.
to attempt a connection test to each node prior to attempting to run the
remote command. If the test fails for any reason, the remote command will
not be attempted. This can be useful when clusterfiles have suffered bitrot
and some nodes no longer exist, or might be down for maintenance. The default
timeout is 5 seconds. The timeout can be changed with the
will attempt to guess the port number of the remote service based on your
setting. It knows about ssh and rsh. If
fails to guess your port correctly, you may use the
argument to set the remote port number. If the
environment variable exists, the testing will automatically take place.
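
A connection test of this kind can be approximated as follows. This is a
minimal sketch under stated assumptions, not the program's own code: the
names guess_port and node_is_reachable are hypothetical, and the port table
uses the standard service ports for ssh (22) and rsh (514).

```python
import os
import socket

# Standard TCP ports for the two remote commands the text says are known.
KNOWN_PORTS = {"ssh": 22, "rsh": 514}


def guess_port(remote_command):
    """Guess the service port from the remote command name, or return None."""
    return KNOWN_PORTS.get(os.path.basename(remote_command))


def node_is_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unresolvable names, and timeouts.
        return False
```

A caller would skip the remote command for any node where
node_is_reachable() returns False, and would need an explicit port whenever
guess_port() returns None.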
Prints the version of ClusterIt to stdout and exits.
option is specified, followed by a comma delimited list of machine names,
will be run on each node in the list. Without this option,
runs on the nodes listed in the file pointed to by the
option can be used to exclude specific nodes from the cluster. The format
is the same as the
option, a comma delimited list of machine names. This option is silently
ignored if used with the
utilizes the following environment variables.
Contains a filename, which is a newline separated list of nodes
in the cluster.
Command to use to connect to remote machines. The command chosen must
be able to connect with no password to the remote host. Defaults to
Arguments to pass to the remote shell command. Defaults to none.
The port number used to test remote connections. See the
will automatically test all hosts before launching the remote command. See the
option for more information.
The timeout in seconds to use when testing for remote connections.
The username to connect to remote machines as by default.
When set, limits the maximum number of concurrent commands sent at once.
This can be used to keep from overloading a small host when sending out
commands in parallel. Defaults to 64. This environment setting can be
overridden by the
is running in interactive mode, it reads commands from the terminal and acts
upon them accordingly. During interactive mode, every few seconds, depending
on the interval,
will query the next few hosts in the cluster, and merge the data from those
hosts into the display. The number of hosts updated each interval is
determined by the fanout setting.
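
The rolling update can be pictured as a window that moves through the host
list, one fanout-sized step per interval. The sketch below is illustrative
only: the name update_windows is hypothetical, and wrapping from the last
host back to the first within a single window is an assumption that the text
above does not confirm.

```python
from itertools import cycle, islice


def update_windows(hosts, fanout):
    """Yield, once per interval, the next group of hosts to query.

    Assumption: the host list is treated as a ring, so a window may wrap
    from the end of the list back to its start.
    """
    if not hosts or fanout < 1:
        return
    ring = cycle(hosts)
    size = min(fanout, len(hosts))  # never query a host twice per interval
    while True:
        yield list(islice(ring, size))


windows = update_windows(["a", "b", "c", "d", "e"], 2)
first_three = [next(windows) for _ in range(3)]
print(first_three)  # -> [['a', 'b'], ['c', 'd'], ['e', 'a']]
```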
Certain characters cause immediate action by
Switch the mode to the process mode, sorted by the CPU usage of each process.
Switch the mode to the process mode, sorted by the memory usage of each
Switch the mode to the load average mode, sorted by hostname.
Switch the mode to the load average mode, sorted by load average.
Switch the mode to the load average mode, sorted by active memory.
Switch the mode to the load average mode, sorted by inactive memory.
Switch the mode to the load average mode, sorted by file cache/buffer memory.
Switch the mode to the load average mode, sorted by free memory.
Switch the mode to the load average mode, sorted by swap used.
Display the interactive help menu.
The file pointed to by the
environment variable has the following format:
This example would have pollux and castor as members of no groups, rigel and
kent as members of group 'alpha', and alshain and altair as members of group
Note the format of the GROUP command: it is in all capital letters, followed
by a colon and the group name. There can be no spaces following the GROUP
command, or in the name of the group.
There is also a LUMP command, which is identical in syntax to the GROUP
command. This command allows you to create a named group of groups. Each
member of the lump is the name of a group. The LUMP command is terminated
by another LUMP or GROUP command, or the EOF marker.
Any line beginning with a
symbol denotes a comment field, and the entire line will be ignored.
Note that a hash mark placed anywhere other than the first character
of a line will be considered part of a valid hostname or command.
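
The rules above (plain hostnames, GROUP and LUMP commands, and comment
lines) can be illustrated with a small Python parser. This is a sketch of
the described format, not the program's own parser. The function names are
hypothetical, and in the sample data the second group name 'beta' and the
lump name 'everything' are placeholders chosen for illustration. Skipping
blank lines is also an assumption.

```python
def parse_cluster_file(text):
    """Parse cluster-file text into (ungrouped, groups, lumps)."""
    ungrouped, groups, lumps = [], {}, {}
    current = ungrouped  # hosts before any GROUP/LUMP belong to no group
    for raw in text.splitlines():
        if raw.startswith("#"):  # comment only when '#' is the first character
            continue
        line = raw.strip()
        if not line:  # assumption: blank lines are ignored
            continue
        if line.startswith("GROUP:"):
            current = groups.setdefault(line[len("GROUP:"):], [])
        elif line.startswith("LUMP:"):
            current = lumps.setdefault(line[len("LUMP:"):], [])
        else:
            current.append(line)  # a hostname, or a group name inside a lump
    return ungrouped, groups, lumps


def expand_lump(name, groups, lumps):
    """Return the hosts of every group named in the given lump, in order."""
    return [host for group in lumps[name] for host in groups.get(group, [])]


SAMPLE = """\
# placeholder names: 'beta' and 'everything' are illustrative only
pollux
castor
GROUP:alpha
rigel
kent
GROUP:beta
alshain
altair
LUMP:everything
alpha
beta
"""

ungrouped, groups, lumps = parse_cluster_file(SAMPLE)
print(ungrouped)        # -> ['pollux', 'castor']
print(groups["alpha"])  # -> ['rigel', 'kent']
print(expand_lump("everything", groups, lumps))
```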
command appeared in clusterit 2.5.
was made possible by a generous donation from Mach1 Computing, LLC.
was written by Tim Rightnour.
Solaris 2.5.1 has a maximum of 256 open file descriptors. This means
will fail on a fanout size greater than about 32-40 nodes.
uses the top command in batch mode to collect data from remote machines.
Because of this, the top command must exist on the remote node, and
must understand its output.
should be able to understand output from top on NetBSD, Solaris, and Linux.
However, if the output format changes or differs from what is expected,
parsing may break. If
fails to work for you, please send the output of:
top -Sb 20
top -bn 1
to email@example.com, or file a bug report on SourceForge.
is still rather new, and is likely to still have a few display bugs and