NAME
qstat - show the status of Grid Engine jobs and queues
SYNTAX
qstat [ -alarm ] [ -ext ] [ -f ] [ -F [resource_name,...] ]
[ -g d ] [ -help ] [ -j [job_list] ] [ -l
resource[=value],... ] [ -ne ] [ -pe pe_name,... ] [ -q
queue,... ] [ -r ] [ -s {p|r|s|z|hu|ho|hs|hj|ha|h}[+] ]
[ -t ] [ -U user,... ] [ -u user,... ]
DESCRIPTION
qstat shows the current status of the available Grid Engine
queues and the jobs associated with the queues. Selection
options allow you to get information about specific jobs,
queues or users. Without any options, qstat displays only
a list of jobs, with no queue status information.
OPTIONS
-alarm
Displays the reason(s) for queue alarm states. Outputs
one line per reason containing the resource value and
threshold. For details about the resource value please
refer to the description of the Full Format in section
OUTPUT FORMATS below.
-ext Supported only on Grid Engine Enterprise Edition
systems; not available on Grid Engine systems.
Displays additional information relevant to Grid Engine
Enterprise Edition for each job (see OUTPUT FORMATS
below).
-f Specifies a "full" format display of information. The
-f option causes summary information on all queues to
be displayed along with the queued job list.
-F [ resource_name,... ]
As with -f, information is displayed on all
jobs as well as queues. In addition, qstat will present
a detailed listing of the current resource availability
per queue with respect to all resources (if the option
argument is omitted) or with respect to those resources
contained in the resource_name list. Please refer to
the description of the Full Format in section OUTPUT
FORMATS below for further detail.
-g d Displays job arrays verbosely, one line per job
task. By default, job arrays are grouped and
all tasks with the same status (for pending tasks only)
are displayed in a single line. The job array task id
range field in the output (see section OUTPUT FORMATS)
specifies the corresponding set of tasks.
The -g switch currently has only the single option
argument d. Other option arguments are reserved for
future extensions.
-help
Prints a listing of all options.
-j [job_list]
Prints the reason for not being scheduled, either for
all pending jobs or for the jobs contained in job_list.
-l resource[=value],...
Defines the resources required by the jobs or granted
by the queues on which information is requested.
Matching is performed on queues. The pending jobs are
restricted to jobs that might run in one of the above
queues.
-ne In combination with -f, this option suppresses the
display of empty queues, i.e. queues in which no jobs
are currently running.
-pe pe_name,...
Displays status information with respect to queues
which are attached to at least one of the parallel
environments listed in the comma-separated option
argument. Status information for jobs is displayed
either for those which execute in one of the selected
queues or which are pending and might get scheduled to
those queues in principle.
-q queue,...
Specifies the queues for which job information is to be
displayed.
-r Prints extended information about the resource require-
ments of the displayed jobs. Please refer to the OUTPUT
FORMATS sub-section Expanded Format below for detailed
information.
-s {p|r|s|z|hu|ho|hs|hj|ha|h}[+]
Prints only jobs in the specified state; any combination
of states is possible. -s prs corresponds to the
regular qstat output without -s at all. To show
recently finished jobs, use -s z. To display jobs in
user/operator/system hold, use the -s hu/ho/hs option.
The -s ha option shows jobs which were submitted with
the qsub -a command. qstat -s hj displays all jobs
which are not eligible for execution unless the job has
entries in the job dependency list (see the -a and
-hold_jid options to qsub(1)).
-t Prints extended information about the controlled sub-
tasks of the displayed parallel jobs. Please refer to
the OUTPUT FORMATS sub-section Expanded Format below
for detailed information. Sub-tasks of parallel jobs
should not be confused with job array tasks (see -g
option above and -t option to qsub(1)).
-U user,...
Displays status information with respect to queues to
which the specified users have access. Status informa-
tion for jobs is displayed either for those which exe-
cute in one of the selected queues or which are pending
and might get scheduled to those queues in principle.
-u user,...
Displays information only on those jobs and queues
associated with the users in the given user list.
Queue status information is displayed if the -f or -F
options are specified additionally and if the user runs
jobs in those queues.
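The way the options above combine can be sketched as a small
helper that assembles a qstat command line. This is an
illustrative sketch only: the function build_qstat_argv and
its parameter names are hypothetical and not part of Grid
Engine; only the option letters are taken from this page.
The helper builds the argument list and does not run qstat.

```python
from typing import List, Optional, Sequence


def build_qstat_argv(
    full: bool = False,
    skip_empty_queues: bool = False,
    users: Optional[Sequence[str]] = None,
    queues: Optional[Sequence[str]] = None,
    states: Optional[str] = None,
) -> List[str]:
    """Assemble a qstat argument list (hypothetical helper, not an SGE API)."""
    argv = ["qstat"]
    if full:
        argv.append("-f")
    if skip_empty_queues:
        # This page describes -ne only in combination with -f.
        if not full:
            raise ValueError("-ne is described only in combination with -f")
        argv.append("-ne")
    if users:
        # List-valued options take a comma-separated argument.
        argv += ["-u", ",".join(users)]
    if queues:
        argv += ["-q", ",".join(queues)]
    if states:
        # Passed through unchanged, e.g. "prs" or "z".
        argv += ["-s", states]
    return argv


if __name__ == "__main__":
    print(" ".join(build_qstat_argv(full=True, skip_empty_queues=True,
                                    users=["alice", "bob"])))
```

The resulting list can be handed to a process launcher on a
host where qstat is installed; the user names shown are
placeholders.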
OUTPUT FORMATS
Depending on the presence or absence of the -alarm, -f or -F
and -r and -t options, three output formats need to be
differentiated.
In case of a Grid Engine Enterprise Edition system, the -ext
option may be used to display additional information for
each job.
Reduced Format (without -f and -F)
Following the header line a line is printed for each job
consisting of
o the job ID.
o the priority of the jobs as assigned to them via the -p
option to qsub(1) or qalter(1) determining the order of
the pending jobs list.
o the name of the job.
o the user name of the job owner.
o the status of the job - one of t(ransferring), r(unning),
R(estarted), s(uspended), S(uspended), T(hreshold),
w(aiting) or h(old).
The states t(ransferring) and r(unning) indicate that a
job is about to be executed or is already executing,
whereas the states s(uspended), S(uspended) and
T(hreshold) show that an already running job has been
suspended. The s(uspended) state is caused by suspending
the job via the qmod(1) command. The S(uspended) state
indicates that the queue containing the job is suspended
and therefore the job is also suspended. The T(hreshold)
state shows that at least one suspend threshold of the
corresponding queue was exceeded (see queue_conf(5)) and
that the job has been suspended as a consequence. The
state R(estarted) indicates that the job was restarted.
This can be caused by a job migration or by one of the
reasons described in the -r section of the qsub(1)
command.
The states w(aiting) and h(old) only appear for pending
jobs. The h(old) state indicates that a job currently is
not eligible for execution, either due to a hold state
assigned to it via qhold(1), qalter(1) or the qsub(1) -h
option, or because it is waiting for completion of the
jobs on which it depends, as assigned via the -hold_jid
option of qsub(1) or qalter(1).
o the submission or start time and date of the job.
o the queue the job is assigned to (for running or
suspended jobs only).
o the function of the running jobs (MASTER or SLAVE - the
latter for parallel jobs only).
o the job array task id. Will be empty for non-array jobs.
See the -t option to qsub(1) and the -g option above for
additional information.
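The job status letters listed above can be captured in a
small lookup table. The following is a minimal sketch for
scripts that post-process qstat output; the names
JOB_STATES, describe_job_state and is_suspended are
hypothetical, and the descriptions are paraphrased from this
section.

```python
# Status letters of the Reduced Format, as described in this section.
JOB_STATES = {
    "t": "transferring (about to be executed)",
    "r": "running",
    "R": "restarted",
    "s": "suspended via qmod",
    "S": "suspended because its queue is suspended",
    "T": "suspended because a queue suspend threshold was exceeded",
    "w": "waiting",
    "h": "hold",
}

# The three letters that indicate a suspended, previously running job.
SUSPENDED_STATES = frozenset("sST")

# The two letters that only appear for pending jobs.
PENDING_STATES = frozenset("wh")


def describe_job_state(letter: str) -> str:
    """Return a description for one status letter (hypothetical helper)."""
    return JOB_STATES.get(letter, "unrecognized status letter")


def is_suspended(letter: str) -> bool:
    """True if the letter is one of s, S or T."""
    return letter in SUSPENDED_STATES


if __name__ == "__main__":
    for letter in "trRsSTwh":
        print(letter, "-", describe_job_state(letter))
```

Note that the lookup is case sensitive, since s and S (and
r and R) denote different states.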
If the -t option is supplied, each job status line also con-
tains
o the parallel task ID (do not confuse parallel tasks with
job array tasks),
o the status of the parallel task - one of r(unning),
R(estarted), s(uspended), S(uspended), T(hreshold),
w(aiting), h(old), or x(exited).
o the cpu, memory, and I/O usage (Grid Engine Enterprise
Edition only),
o the exit status of the parallel task,
o and the failure code and message for the parallel task.
Full Format (with -f or -F)
Following the header line a section for each queue separated
by a horizontal line is provided. For each queue the infor-
mation printed consists of
o the queue name,
o the queue type - one of B(atch), I(nteractive),
C(heckpointing), P(arallel), T(ransfer) or combinations
thereof,
o the number of used and available job slots,
o the load average of the queue host,
o the architecture of the queue host and
o the state of the queue - one of u(nknown) if the
corresponding sge_execd(8) cannot be contacted, a(larm),
A(larm), C(alendar suspended), s(uspended),
S(ubordinate), d(isabled), D(isabled), E(rror) or combi-
nations thereof.
If the state is a(larm) at least one of the load thresholds
defined in the load_thresholds list of the queue
configuration (see queue_conf(5)) is currently exceeded,
which prevents further jobs from being scheduled to that
queue.
As opposed to this, the state A(larm) indicates that at
least one of the suspend thresholds of the queue (see
queue_conf(5)) is currently exceeded. This will result in
jobs running in that queue being successively suspended
until no threshold is violated.
The states s(uspended) and d(isabled) can be assigned to
queues and released via the qmod(1) command. Suspending a
queue will cause all jobs executing in that queue to be
suspended.
The states D(isabled) and C(alendar suspended) indicate that
the queue has been disabled or suspended automatically via
the calendar facility of Grid Engine (see calendar_conf(5)),
while the S(ubordinate) state indicates that the queue has
been suspended via subordination to another queue (see
queue_conf(5) for details). When a queue is suspended
(regardless of the cause), all jobs executing in that queue
are suspended too.
If an E(rror) state is displayed for a queue, sge_execd(8)
on that host was unable to locate the sge_shepherd(8)
executable in order to start a job. Please check the error
logfile of that sge_execd(8) for leads on how to resolve the
problem. Afterwards, enable the queue manually via the -c
option of the qmod(1) command.
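Because queue states may be combined, a script reading the
queue state field has to interpret it letter by letter. The
sketch below illustrates this under the assumption that a
combined state is simply a string of the letters described
above; the names QUEUE_STATES, decode_queue_state and
jobs_are_suspended are hypothetical.

```python
from typing import List

# Queue state letters of the Full Format, as described in this section.
QUEUE_STATES = {
    "u": "unknown: sge_execd cannot be contacted",
    "a": "alarm: a load threshold is exceeded",
    "A": "alarm: a suspend threshold is exceeded",
    "C": "suspended by the calendar facility",
    "s": "suspended via qmod",
    "S": "suspended via subordination to another queue",
    "d": "disabled via qmod",
    "D": "disabled by the calendar facility",
    "E": "error",
}

# Letters that mean the queue is suspended, whatever the cause.
SUSPEND_LETTERS = frozenset("sSC")


def decode_queue_state(field: str) -> List[str]:
    """Describe every letter of a (possibly combined) queue state field."""
    return [QUEUE_STATES.get(ch, "unrecognized letter " + repr(ch))
            for ch in field]


def jobs_are_suspended(field: str) -> bool:
    """True if the queue is suspended, so its jobs are suspended too."""
    return any(ch in SUSPEND_LETTERS for ch in field)


if __name__ == "__main__":
    for description in decode_queue_state("aS"):
        print(description)
```

An empty state field decodes to an empty list, which this
sketch treats as a queue with no state letters to report.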
If the -F option was used, resource availability information
is printed following the queue status line. For each
resource (as selected in an option argument to -F or for all
resources if the option argument was omitted) a single line
is displayed with the following format:
o a one letter specifier indicating whether the current
resource availability value was dominated by either
`g' - a cluster global,
`h' - a host total or
`q' - a queue related resource consumption.
o a second one letter specifier indicating the source for
the current resource availability value, being one of
`l' - a load value reported for the resource,
`L' - a load value for the resource after administrator
defined load scaling has been applied,
`c' - availability derived from the consumable resources
facility (see complexes(5)),
`v' - a default complexes configuration value never
overwritten by a load report or a consumable update or
`f' - a fixed availability definition derived from a
non-consumable complex attribute or a fixed resource
limit.
o after a colon the name of the resource on which informa-
tion is displayed.
o after an equal sign the current resource availability
value.
The displayed availability values and the sources from which
they derive are always the minimum values of all possible
combinations. Hence, for example, a line of the form
"qf:h_vmem=4G" indicates that a queue currently has a
maximum availability in virtual memory of 4 gigabytes, where
this value is a fixed value (e.g. a resource limit in the
queue configuration) and it is queue dominated, i.e. the
host in total may have more virtual memory available than
this, but the queue doesn't allow for more. In contrast, a
line "hl:h_vmem=4G" would also indicate an upper bound of 4
gigabytes of virtual memory availability, but the limit
would be derived from a load value currently reported for
the host. So while the queue might allow for jobs with
higher virtual memory requirements, the host on which this
particular queue resides currently only has 4 gigabytes
available.
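The resource availability line format (two specifier
letters, a colon, the resource name, an equal sign and the
value) lends itself to simple parsing. The following is a
minimal sketch, assuming the line contains nothing but that
format apart from surrounding whitespace; the names
Availability and parse_availability are hypothetical.

```python
from typing import NamedTuple

# First letter: what dominates the availability value.
DOMINANCE = {"g": "cluster global", "h": "host total", "q": "queue"}

# Second letter: where the availability value comes from.
SOURCE = {
    "l": "load value",
    "L": "scaled load value",
    "c": "consumable resource",
    "v": "default complexes value",
    "f": "fixed definition",
}


class Availability(NamedTuple):
    dominance: str
    source: str
    name: str
    value: str


def parse_availability(line: str) -> Availability:
    """Parse a line such as 'qf:h_vmem=4G' (hypothetical helper)."""
    spec, colon, rest = line.strip().partition(":")
    if not colon or len(spec) != 2:
        raise ValueError("expected two specifier letters and a colon")
    dominance, source = spec[0], spec[1]
    if dominance not in DOMINANCE or source not in SOURCE:
        raise ValueError("unknown specifier letters: " + spec)
    name, equals, value = rest.partition("=")
    if not equals or not name:
        raise ValueError("expected name=value after the colon")
    # The value is kept as text; unit suffixes are not interpreted here.
    return Availability(DOMINANCE[dominance], SOURCE[source], name, value)


if __name__ == "__main__":
    print(parse_availability("qf:h_vmem=4G"))
    print(parse_availability("hl:h_vmem=4G"))
```

The value is deliberately left as a string, since this page
does not define how suffixes such as G are to be converted.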
If the -alarm option was used, information is displayed
about resources that violate load or suspend thresholds.
The same format as with the -F option is used, with the
following extensions:
o the line starts with the keyword `alarm'
o appended to the resource value is the type and value of
the appropriate threshold
After the queue status line (in case of -f) or the resource
availability information (in case of -F) a single line is
printed for each job running currently in this queue. Each
job status line contains
o the job ID,
o the job name,
o the job owner name,
o the status of the job - one of t(ransferring), r(unning),
R(estarted), s(uspended), S(uspended) or T(hreshold) (see
the Reduced Format section for detailed information),
o the start date and time and the function of the job (MAS-
TER or SLAVE - only meaningful in case of a parallel job)
and
o the priority of the jobs.
If the -t option is supplied, each job status line also con-
tains
o the task ID,
o the status of the task - one of r(unning), R(estarted),
s(uspended), S(uspended), T(hreshold), w(aiting), h(old),
or x(exited) (see the Reduced Format section for detailed
information),
o the cpu, memory, and I/O usage (Grid Engine Enterprise
Edition only),
o the exit status of the task,
o and the failure code and message for the task.
Following the list of queue sections a PENDING JOBS list may
be printed if jobs are waiting to be assigned to a queue. A
status line is displayed for each waiting job, similar to
the one for the running jobs. The differences are that the
status for the jobs is w(aiting) or h(old), that the submit
time and date is shown instead of the start time and that no
function is displayed for the jobs.
In very rare cases, e.g. if sge_qmaster(8) starts up from an
inconsistent state in the job or queue spool files or if the
clean queue (-cq) option of qconf(1) is used, qstat cannot
assign jobs to either the running or pending jobs section of
the output. In this case a job status inconsistency (e.g. a
job has a running status but is not assigned to a queue) has
been detected. Such jobs are printed in an ERROR JOBS
section at the very end of the output. The ERROR JOBS section
should disappear upon restart of sge_qmaster(8). Please
contact your Grid Engine support representative if you feel
uncertain about the cause or effects of such jobs.
Expanded Format (with -r)
If the -r option is specified, the following information is
printed for each displayed job (a single line for each of
the following job characteristics):
o The hard and soft resource requirements of the job as
specified with the qsub(1) -l option.
o The requested parallel environment including the desired
queue slot range (see -pe option of qsub(1)).
o The requested checkpointing environment of the job (see
the qsub(1) -ckpt option).
o In case of running jobs, the granted parallel environment
with the granted number of queue slots.
Enhanced Grid Engine Enterprise Edition Output (with -ext)
For each job the following additional items are displayed:
project
The project to which the job is assigned as specified
in the qsub(1) -P option.
department
The department to which the user belongs (use the -sul
and -su options of qconf(1) to display the current
department definitions).
deadline
The deadline initiation time of the job as specified
with the qsub(1) -dl option.
cpu The current accumulated CPU usage of the job.
mem The current accumulated memory usage of the job.
io The current accumulated IO usage of the job.
tckts
The total number of tickets currently assigned to the
job.
ovrts
The override tickets as assigned by the -ot option of
qalter(1).
otckt
The override portion of the total number of tickets
currently assigned to the job.
dtckt
The deadline portion of the total number of tickets
currently assigned to the job.
ftckt
The functional portion of the total number of tickets
currently assigned to the job.
stckt
The share portion of the total number of tickets
currently assigned to the job.
share
The share of the total system to which the job is enti-
tled currently.
ENVIRONMENTAL VARIABLES
SGE_ROOT Specifies the location of the Grid Engine
standard configuration files.
SGE_CELL If set, specifies the default Grid Engine
cell. To address a Grid Engine cell qstat
uses (in the order of precedence):
The name of the cell specified in the
environment variable SGE_CELL, if it is
set.
The name of the default cell, i.e.
default.
SGE_DEBUG_LEVEL
If set, specifies that debug information
should be written to stderr. In addition the
level of detail in which debug information is
generated is defined.
COMMD_PORT If set, specifies the tcp port on which
sge_commd(8) is expected to listen for com-
munication requests. Most installations will
use a services map entry instead to define
that port.
COMMD_HOST If set, specifies the host on which the
particular sge_commd(8) to be used for Grid
Engine communication of the qstat client
resides. By default the local host is used.
FILES
<sge_root>/<cell>/common/act_qmaster
Grid Engine master host file
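The precedence rule for SGE_CELL and the location of the
act_qmaster file can be combined into a short sketch. It
assumes that an empty SGE_CELL is treated like an unset one
and that SGE_ROOT must be set; both are assumptions of this
example, and the names resolve_cell and act_qmaster_path are
hypothetical.

```python
import os
import posixpath
from typing import Mapping, Optional


def resolve_cell(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the cell name: SGE_CELL if set, otherwise 'default'."""
    env = os.environ if env is None else env
    # Assumption: an empty SGE_CELL counts as not set.
    return env.get("SGE_CELL") or "default"


def act_qmaster_path(env: Optional[Mapping[str, str]] = None) -> str:
    """Build <sge_root>/<cell>/common/act_qmaster from the environment."""
    env = os.environ if env is None else env
    sge_root = env.get("SGE_ROOT")
    if not sge_root:
        # Assumption: without SGE_ROOT there is no path to construct.
        raise KeyError("SGE_ROOT is not set")
    return posixpath.join(sge_root, resolve_cell(env), "common",
                          "act_qmaster")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(act_qmaster_path({"SGE_ROOT": "/opt/sge"}))
    print(act_qmaster_path({"SGE_ROOT": "/opt/sge", "SGE_CELL": "test"}))
```

Passing a plain dictionary instead of the process environment
keeps the helper easy to exercise without a Grid Engine
installation.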
SEE ALSO
sge_intro(1), qalter(1), qconf(1), qhold(1), qhost(1),
qmod(1), qsub(1), calendar_conf(5), complexes(5),
queue_conf(5), sge_commd(8), sge_execd(8), sge_qmaster(8),
sge_shepherd(8).
COPYRIGHT
See sge_intro(1) for a full statement of rights and permis-
sions.