Commit 2ac741d8 authored by Manuel Günther


Added support for --stop-on-failure option; bug fixes in the local scheduler; updated documentation; small improvements.
parent 5bbf0cd9
......@@ -2,116 +2,121 @@
Parallel Job Manager
======================
The Job Manager is python wrapper around SGE utilities like ``qsub``, ``qstat``
and ``qdel``. It interacts with these tools to submit and manage grid jobs
making up a complete workflow ecosystem.
The Job Manager is a python wrapper around SGE utilities like ``qsub``, ``qstat`` and ``qdel``.
It interacts with these tools to submit and manage grid jobs, making up a complete workflow ecosystem.
Currently, it is set up to work with the SGE grid at Idiap, but it can also be modified for use with other SGE grids.
Since version 1.0 there is also a local submission system introduced. Instead
of sending jobs to the SGE grid, it executes them in parallel processes on the
local machine.
Since version 1.0, a local submission system is also included.
Instead of sending jobs to the SGE grid, it executes them in parallel processes on the local machine, using a simple scheduling system.
Every time you interact with the Job Manager, a local database file (normally
named ``submitted.sql3``) is read or written so it preserves its state during
decoupled calls. The database contains all information about jobs that is
required for the Job Manager to:
* submit jobs (includes wrapped python jobs or Torch5spro specific jobs)
Submitting jobs to the SGE grid
+++++++++++++++++++++++++++++++
Every time you interact with the Job Manager, a local database file (normally named ``submitted.sql3``) is read or written so that its state is preserved across decoupled calls.
The database contains all information about jobs that is required for the Job Manager to:
* submit jobs of any kind
* probe for submitted jobs
* query SGE for submitted jobs
* identify problems with submitted jobs
* clean up logs from submitted jobs
* easily re-submit jobs if problems occur
* support parametric (array) jobs
* submit jobs with dependencies, which automatically get killed when a dependency fails
Many of these features are also achievable using the stock SGE utilities, the
Job Manager only makes it dead simple.
Many of these features are also achievable using the stock SGE utilities; the Job Manager just makes them dead simple.
If you really want to use the stock SGE utilities, gridtk defines some wrapper scripts that allow you to use ``qsub``, ``qstat`` and ``qdel`` without the need for the SETSHELL command.
For example, you can easily use ``qstat.py`` to query the list of your jobs running in the SGE grid.
Submitting jobs to the SGE grid
+++++++++++++++++++++++++++++++
To interact with the Job Manager we use the ``jman`` utility. Make sure to have
your shell environment setup to reach it w/o requiring to type-in the full
path. The first task you may need to pursue is to submit jobs. Here is how::
Submitting a simple job
-----------------------
To interact with the Job Manager we use the ``jman`` utility.
Make sure your shell environment is set up to reach it without typing the full path.
The first task you may need to pursue is to submit jobs.
Here is how::
$ jman submit myscript.py --help
Submitted 6151645 @all.q (0 seconds ago) -S /usr/bin/python myscript.py --help
... Added job '<Job: 1> : submitted -- /usr/bin/python myscript.py --help' to the database
... Submitted job '<Job: 6151645> : queued -- /usr/bin/python myscript.py --help' to the SGE grid.
.. note::
The command ``submit`` of the Job Manager will submit a job that will run in
a python environment. It is not the only way to submit a job using the Job
Manager. You can also use `submit`, that considers the command as a self
sufficient application. Read the full help message of ``jman`` for details and
instructions.
The command ``submit`` of the Job Manager will submit a job that will run in a python environment.
It is not the only way to submit a job using the Job Manager.
You can also submit a job such that the command is treated as a self-sufficient application.
Read the full help message of ``jman`` for details and instructions.
Submitting a parametric job
---------------------------
Parametric or array jobs are jobs that execute the same way, except for the
environment variable ``SGE_TASK_ID``, which changes for every job. This way,
your program controls, which bit of the full job has to be executed in each
(parallel) instance. It is great for forking thousands of jobs into the grid.
Parametric or array jobs are jobs that execute the same way, except for the environment variable ``SGE_TASK_ID``, which changes for every job.
This way, your program controls which bit of the full job has to be executed in each (parallel) instance.
It is great for forking thousands of jobs into the grid.
The next example sends 10 copies of the ``myscript.py`` job to the grid with
the same parameters. Only the variable ``SGE_TASK_ID`` changes between them::
The next example sends 10 copies of the ``myscript.py`` job to the grid with the same parameters.
Only the variable ``SGE_TASK_ID`` changes between them::
$ jman submit -t 10 myscript.py --help
Submitted 6151645 @all.q (0 seconds ago) -S /usr/bin/python myscript.py --help
$ jman -vv submit -t 10 myscript.py --help
... Added job '<Job: 2> : submitted -- /usr/bin/python myscript.py --help' to the database
... Submitted job '<Job: 6151646> : queued -- /usr/bin/python myscript.py --help' to the SGE grid.
The ``-t`` option in ``jman`` accepts different kinds of job array
descriptions. Have a look at the help documentation for details with ``jman
--help``.
The ``-t`` option in ``jman`` accepts different kinds of job array descriptions.
Have a look at the help documentation for details with ``jman --help``.
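
As an illustration, a hypothetical ``myscript.py`` (it is not part of gridtk and only sketches the idea) might use ``SGE_TASK_ID`` to pick its own chunk of work::

   import os
   import sys

   def main():
       # SGE sets SGE_TASK_ID for array jobs; for non-array jobs it is unset
       # or the literal string "undefined", in which case we fall back to task 1
       task = os.environ.get("SGE_TASK_ID", "undefined")
       task_id = int(task) if task.isdigit() else 1

       # pretend the full job consists of 10 chunks; the task id selects one of them
       work_items = ["chunk-%d" % i for i in range(1, 11)]
       sys.stdout.write("task %d processes %s\n" % (task_id, work_items[task_id - 1]))

   if __name__ == "__main__":
       main()

Submitting this script with ``-t 10`` would then execute the 10 chunks as 10 independent grid tasks.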
Probing for jobs
----------------
Once the job has been submitted you will noticed a database file (by default
called ``submitted.db``) has been created in the current working directory. It
contains the information for the job you just submitted::
Once the job has been submitted, you will notice that a database file (by default called ``submitted.sql3``) has been created in the current working directory.
It contains the information for the job you just submitted::
$ jman list
job-id queue age arguments
======== ===== === =======================================================
6151645 all.q 2m -S /usr/bin/python myscript.py --help
From this dump you can see the SGE job identifier, the queue the job has been
submitted to and the command that was given to ``qsub``. The ``list`` command
from ``jman`` will show the current status of the job, which is updated
automatically as soon as the grid job finishes.
job-id queue status job-name dependencies submitted command line
==================== ========= ============== ==================== ============================== ===========================================
6151645 all.q queued None [] /usr/bin/python myscript.py --help
6151646 [1-10:1] all.q queued None [] /usr/bin/python myscript.py --help
From this dump you can see the SGE job identifier (including the array job range), the queue the job has been submitted to, the current status of the job in the SGE grid, the dependencies of the job, and the command that is executed in the SGE grid.
The ``list`` command from ``jman`` will show the current status of the job, which is updated automatically as soon as the grid job finishes.
.. note::
This feature is new since version 1.0.0. There is no need to refresh the
database any more.
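
Since the state is kept in an ordinary SQLite file, you can also peek into it directly with Python's ``sqlite3`` module if you are curious.
The following is only a sketch; it assumes the default database name and does not rely on the (undocumented) table layout::

   import sqlite3

   conn = sqlite3.connect("submitted.sql3")
   # list the tables and how many rows each of them contains
   tables = conn.execute("SELECT name FROM sqlite_master WHERE type='table'").fetchall()
   for (table,) in tables:
       count = conn.execute("SELECT COUNT(*) FROM %s" % table).fetchone()[0]
       print("%s: %d rows" % (table, count))
   conn.close()

For everyday use, the ``jman list`` and ``jman report`` commands shown here are the intended interface.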
Submitting dependent jobs
-------------------------
Sometimes, the execution of one job might depend on the execution of another
job. The JobManager can take care of this, simply by adding the id of the
job that we have to wait for::
Sometimes, the execution of one job might depend on the execution of another job.
The JobManager can take care of this, simply by adding the id of the job that we have to wait for::
$ jman submit --dependencies 6151645 myscript.py --help
Submitted 6151646 @all.q (0 seconds ago) -S /usr/bin/python myscript.py --help
$ jman -vv submit --dependencies 6151645 -- /usr/bin/python myscript.py --help
... Added job '<Job: 3> : submitted -- /usr/bin/python myscript.py --help' to the database
... Submitted job '<Job: 6151647> : queued -- /usr/bin/python myscript.py --help' to the SGE grid.
Now, the new job will only be run after the first one has finished.
.. note::
Please note the ``--`` between the list of dependencies and the command.
Inspecting log files
--------------------
If jobs finish, the result of the executed job will be shown. In case it is
non-zero, might want to inspect the log files as follows::
When jobs finish, the result of the executed job will be shown in the ``list``.
In case it is non-zero, you might want to inspect the log files as follows::
$ jman report --errors-only
Job 6151645 @all.q (34 minutes ago) -S /usr/bin/python myscript.py --help
Command line: (['-S', '/usr/bin/python', '--', 'myscript.py', '--help'],) {'deps': [], 'stderr': 'logs', 'stdout': 'logs', 'queue': 'all.q', 'cwd': True, 'name': None}
6151645 stdout (/remote/filer.gx/user.active/aanjos/work/spoofing/idiap-gridtk/logs/shell.py.o6151645)
6151645 stderr (/remote/filer.gx/user.active/aanjos/work/spoofing/idiap-gridtk/logs/shell.py.e6151645)
Traceback (most recent call last):
...
...
<Job: 6151646 - 'jman'> : failure (2) -- /usr/bin/python myscript.py --help
/usr/bin/python: can't open file 'myscript.py': [Errno 2] No such file or directory
Hopefully, that helps in debugging the problem!
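
If you prefer to open the log files yourself, their names follow the pattern visible in the output above: ``<script>.o<job-id>`` for stdout and ``<script>.e<job-id>`` for stderr, inside the log directory.
A minimal sketch for locating them, assuming the ``logs`` directory used in the examples::

   import glob

   def find_logs(job_id, log_dir="logs"):
       """Returns the stdout and stderr log files written for the given SGE job id."""
       out = sorted(glob.glob("%s/*.o%d*" % (log_dir, job_id)))
       err = sorted(glob.glob("%s/*.e%d*" % (log_dir, job_id)))
       return out, err

   print(find_logs(6151645))

The trailing wildcard also catches the per-task suffixes that array jobs append to the log file names.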
......@@ -119,84 +124,119 @@ Hopefully, that helps in debugging the problem!
Re-submitting the job
---------------------
If you are convinced the job did not work because of external conditions (e.g.
temporary network outage), you may re-submit it, *exactly* like it was
submitted the first time::
If you are convinced the job did not work because of external conditions (e.g. temporary network outage), you may re-submit it, *exactly* like it was submitted the first time::
$ jman resubmit --clean
Re-submitted job 6151663 @all.q (1 second ago) -S /usr/bin/python myscript.py --help
removed `logs/myscript.py.o6151645'
removed `logs/myscript.py.e6151645'
deleted job 6151645 from database
$ jman -vv resubmit --job-id 6151645
...Deleting job '6151645'
...Submitted job '<Job: 6151673> : queued -- /usr/bin/python myscript.py --help' to the SGE grid.
The ``--clean`` flag tells the job manager to clean-up the old log files as it
re-submits the new job. Notice the new job identifier has changed as expected.
By default, the log files of the old job are deleted during re-submission.
If for any reason you want to keep the old log files, use the ``--keep-logs`` option.
Notice the new job identifier has changed as expected.
Stopping a grid job
-------------------
In case you found an error in the code of a grid job that is currently
executing, you might want to kill the job in the grid. For this purpose, you
can use the command::
In case you find an error in the code of a grid job that is currently executing, you might want to kill the job in the grid.
For this purpose, you can use the command::
$ jman stop
The job is removed from the grid, but all log files are still available. A
common use case is to stop the grid job, fix the bugs, and re-submit it.
The job is removed from the grid, but all log files are still available.
A common use case is to stop the grid job, fix the bugs, and re-submit it.
Cleaning-up
-----------
If the job in question will not work no matter how many times we re-submit it,
you may just want to clean it up and do something else. The job manager is
here for you again::
If the job in question will not work no matter how many times we re-submit it, you may just want to clean it up and do something else.
The Job Manager is here for you again::
$ jman delete
Cleaning-up logs for job 6151663 @all.q (5 minutes ago) -S /usr/bin/python myscript.py --help
removed `logs/myscript.py.o6151663'
removed `logs/myscript.py.e6151663'
deleted job 6151663 from database
$ jman -vvv delete
... Deleting job '8258327' from the database.
In case, jobs are still running in the grid, they will be stopped before they
are removed from the database. Inspection on the current directory will now
show you everything concerning the jobs is gone.
In case jobs are still running or queued in the grid, they will be stopped before they are removed from the database.
By default, all logs will be deleted along with the job.
Inspecting the current directory will now show that everything concerning the jobs is gone.
New from version 1.0
++++++++++++++++++++
If you know gridtk from versions below 1.0, you might notice some differences.
The main advantages of the new version are:
* When run in the grid, the jobs now register themselves in the database.
There is no need to refresh the database by hand any more.
This also means that the result (an integer value) of the job execution is available once the job is finished.
Hence, there is no need to rely on the output of the error log any more.
.. note::
In case the job died in the grid, e.g., because of a timeout, this mechanism unfortunately still doesn't work.
Please try to use ``jman -vv communicate`` to see if these kinds of errors happened.
* Jobs are now stored in a proper .sql3 database.
In addition to the jobs, each array job now has its own SQL model, which allows the status and result of each array job to be stored.
To ``list`` the array jobs as well, please use the ``--print-array-jobs`` option.
* In case you have submitted a long list of commands with inter-dependencies, the Job Manager can now kill waiting jobs when one of the jobs they depend on has failed.
Simply use the ``--stop-on-failure`` option during the submission of the jobs; a small sketch of this workflow follows this list.
* The verbosity of gridtk can now be selected in more detail.
Simply repeat the ``-v`` option to get 0: ERROR, 1: WARNING, 2: INFO, 3: DEBUG output.
A good choice is probably the ``-vv`` option to enable INFO output.
Please note that this is not propagated to the jobs that are run in the grid.
.. note::
The ``-v`` options must directly follow the ``jman`` command, before the action (like ``submit`` or ``list``) is chosen.
The ``--database`` option is now also a general option, which has to be given at the same position.
* One important improvement is that you now have the possibility to execute the jobs **in parallel** on the **local machine**.
Please see next section for details.
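
The sketch below illustrates the ``--stop-on-failure`` idea through the programmatic interface of the local manager that is described in the next section; the module path, the database argument and the command-line format are assumptions for illustration, not documented API::

   from gridtk.local import JobManagerLocal  # module path is an assumption

   manager = JobManagerLocal(database="submitted.sql3")

   # first job: a (hypothetical) preprocessing script
   first = manager.submit(["/usr/bin/python", "preprocess.py"], name="preprocess")

   # second job: waits for the first one and is stopped automatically if it fails
   manager.submit(["/usr/bin/python", "train.py"], name="train",
                  dependencies=[first], stop_on_failure=True)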
Running jobs on the local machine
+++++++++++++++++++++++++++++++++
---------------------------------
The JobManager is designed to support mostly the same infrastructure whether jobs are submitted locally or to the SGE grid.
To submit jobs locally, just add the ``--local`` option to the jman command::
$ jman --local -vv submit /usr/bin/python myscript.py --help
One important difference to the grid submission is that the jobs that are submitted to the local machine **do not run immediately**, but are only collected in the ``submitted.sql3`` database.
To run the collected jobs using 4 parallel processes, simply use::
$ jman --local -vv run-scheduler --parallel 4
and all jobs that have not run yet are executed, keeping an eye on the dependencies.
.. note::
The scheduler will run until it is stopped using Ctrl-C.
Hence, as soon as you submit new (local) jobs to the database, it will continue running these jobs.
If you want the scheduler to stop after all scheduled jobs have run, please use the ``--die-when-finished`` option.
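
Conceptually, the local scheduler behaves like the following simplified sketch; this only illustrates the idea (keep at most N processes alive, reap finished ones, start queued commands) and is not gridtk's actual implementation::

   import subprocess
   import time

   def run_local(commands, parallel=4, sleep_time=0.1):
       """Runs the given command lines with at most ``parallel`` processes at a time."""
       queued = list(commands)
       running = []
       while queued or running:
           # drop processes that have finished
           running = [p for p in running if p.poll() is None]
           # fill the free slots with queued commands
           while queued and len(running) < parallel:
               running.append(subprocess.Popen(queued.pop(0)))
           time.sleep(sleep_time)

   run_local([["/usr/bin/python", "myscript.py", "--help"]] * 10, parallel=4)

The real scheduler additionally keeps the database up to date and respects array jobs and dependencies.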
The JobManager is designed such that it supports mainly the same infrastructure
when submitting jobs locally or in the SGE grid. To submit jobs locally, just
add the ``--local`` option to the jman command::
Another difference is that, by default, the jobs write their output to the console and not into log files.
If you want the log file behavior back, specify the log directory during the submission::
$ jman --local submit myscript.py --help
$ jman --local -vv submit --log-dir logs myscript.py --help
Of course, you can choose a different log directory (also for the SGE submission).
Differences between local and grid execution
--------------------------------------------
Furthermore, the job identifiers during local submission usually start from 1 and increase.
Also, during local re-submission, the job ID does not change.
One important difference to the grid submission is that the jobs that are
submitted to the local machine **do not run immediately**, but are only
collected in the ``submitted.sql3`` database. To run the collected jobs using 4
parallel processes, simply use::
$ jman --local execute --parallel 4
Using the local machine for debugging
-------------------------------------
and all jobs that have not run yet are executed, keeping an eye on the
dependencies.
One possible use case for the local job submission is debugging: a job that failed in the grid can be re-run on the local machine.
In this case, you might re-submit the grid job locally::
Another difference is that by default, the jobs write their results into the
command line and not into log files. If you want the log file behavior back,
specify the log directory during the submission::
$ jman --local -vv resubmit --job-id 6151646 --keep-logs
$ jman --local submit --log-dir logs myscript.py --help
(as mentioned above, no new ID is assigned) and run the local scheduler::
Of course, you can choose a different log directory (also for the SGE
submission).
$ jman --local -vv run-scheduler --no-log-files --job-ids 6151646
Furthermore, the job identifiers during local submission usually start from 1
and increase. Also, during local re-submission, the job ID does not change, and
jobs cannot be stopped using the ``stop`` command (you have to kill the
``jman --local --execute`` job first, and then all running jobs).
to print the output and the error to console instead of to log files.
......@@ -13,7 +13,7 @@ import copy, os, sys
import gdbm, anydbm
from cPickle import dumps, loads
from .tools import makedirs_safe, logger, try_get_contents, try_remove_files
from .tools import makedirs_safe, logger
from .manager import JobManager
......@@ -34,15 +34,23 @@ class JobManagerLocal(JobManager):
JobManager.__init__(self, **kwargs)
def submit(self, command_line, name = None, array = None, dependencies = [], log_dir = None, **kwargs):
def submit(self, command_line, name = None, array = None, dependencies = [], log_dir = None, dry_run = False, stop_on_failure = False, **kwargs):
"""Submits a job that will be executed on the local machine during a call to "run".
All kwargs will simply be ignored."""
# add job to database
self.lock()
job = add_job(self.session, command_line=command_line, name=name, dependencies=dependencies, array=array, log_dir=log_dir)
logger.debug("Added job '%s' to the database" % job)
job = add_job(self.session, command_line=command_line, name=name, dependencies=dependencies, array=array, log_dir=log_dir, stop_on_failure=stop_on_failure)
logger.info("Added job '%s' to the database" % job)
if dry_run:
print "Would have added the Job", job, "to the database to be executed locally."
self.session.delete(job)
logger.info("Deleted job '%s' from the database due to dry-run option" % job)
job_id = None
else:
job_id = job.id
# return the new job id
job_id = job.id
self.unlock()
return job_id
......@@ -57,8 +65,8 @@ class JobManagerLocal(JobManager):
# check if this job needs re-submission
if running_jobs or job.status in accepted_old_status:
# re-submit job to the grid
logger.debug("Re-submitted job '%s' to the database" % job)
job.submit()
logger.info("Re-submitted job '%s' to the database" % job)
job.submit('local')
self.session.commit()
self.unlock()
......@@ -71,7 +79,7 @@ class JobManagerLocal(JobManager):
jobs = self.get_jobs(job_ids)
for job in jobs:
if job.status == 'executing':
logger.debug("Reset job '%s' in the database" % job)
logger.info("Reset job '%s' in the database" % job)
job.status = 'submitted'
self.session.commit()
......@@ -83,7 +91,7 @@ class JobManagerLocal(JobManager):
job, array_job = self._job_and_array(job_id, array_id)
if job.status == 'executing':
logger.debug("Reset job '%s' in the database" % job)
logger.info("Reset job '%s' in the database" % job)
job.status = 'submitted'
if array_job is not None and array_job.status == 'executing':
......@@ -119,21 +127,23 @@ class JobManagerLocal(JobManager):
return None
def _result_files(self, process, job_id, array_id = None):
def _result_files(self, process, job_id, array_id = None, no_log = False):
"""Finalizes the execution of the job by writing the stdout and stderr results into the according log files."""
def write(file, process, std):
def write(file, std, process):
f = std if file is None else open(str(file), 'w')
f.write(process.read())
self.lock()
# get the files to write to
job, array_job = self._job_and_array(job_id, array_id)
if array_job:
if no_log:
out, err = None, None
elif array_job:
out, err = array_job.std_out_file(), array_job.std_err_file()
else:
out, err = job.std_out_file(), job.std_err_file()
log_dir = job.log_dir
log_dir = job.log_dir if not no_log else None
job_id = job.id
array_id = array_job.id if array_job else None
self.unlock()
......@@ -142,9 +152,9 @@ class JobManagerLocal(JobManager):
makedirs_safe(log_dir)
# write stdout
write(out, process.stdout, sys.stdout)
write(out, sys.stdout, process.stdout)
# write stderr
write(err, process.stderr, sys.stderr)
write(err, sys.stderr, process.stderr)
if log_dir:
j = self._format_log(job_id, array_id)
......@@ -155,12 +165,14 @@ class JobManagerLocal(JobManager):
def _format_log(self, job_id, array_id = None):
return ("%d (%d)" % (job_id, array_id)) if array_id is not None else ("%d" % job_id)
def run_scheduler(self, parallel_jobs = 1, sleep_time = 0.1):
def run_scheduler(self, parallel_jobs = 1, job_ids = None, sleep_time = 0.1, die_when_finished = False, no_log = False):
"""Starts the scheduler, which is constantly checking for jobs that should be ran."""
running_tasks = []
try:
while True:
# Flag that might be set in some rare cases and that prevents the scheduler from dying
repeat_execution = False
# FIRST, try if there are finished processes; this does not need a lock
for task_index in range(len(running_tasks)-1, -1, -1):
task = running_tasks[task_index]
......@@ -170,7 +182,7 @@ class JobManagerLocal(JobManager):
job_id = task[1]
array_id = task[2] if len(task) > 2 else None
# report the result
self._result_files(process, job_id, array_id)
self._result_files(process, job_id, array_id, no_log)
logger.info("Job '%s' finished execution" % self._format_log(job_id, array_id))
# in any case, remove the job from the list
......@@ -180,18 +192,25 @@ class JobManagerLocal(JobManager):
if len(running_tasks) < parallel_jobs:
# get all unfinished jobs:
self.lock()
jobs = self.get_jobs()
jobs = self.get_jobs(job_ids)
# put all new jobs into the queue
for job in jobs:
if job.status == 'submitted':
job.queue()
# get all unfinished jobs
unfinished_jobs = [job for job in jobs if job.status in ('queued', 'executing')]
# get all unfinished jobs that are submitted to the local queue
unfinished_jobs = [job for job in jobs if job.status in ('queued', 'executing') and job.queue_name == 'local']
for job in unfinished_jobs:
if job.array:
# find array jobs that can run
for array_job in job.array:
if array_job.status == 'queued':
queued_array_jobs = [array_job for array_job in job.array if array_job.status == 'queued']
if not len(queued_array_jobs):
job.finish(0, -1)
repeat_execution = True
else:
# there are new array jobs to run
for i in range(min(parallel_jobs - len(running_tasks), len(queued_array_jobs))):
array_job = queued_array_jobs[i]
# start a new job from the array
process = self._run_parallel_job(job.id, array_job.id)
running_tasks.append((process, job.id, array_job.id))
......@@ -215,6 +234,11 @@ class JobManagerLocal(JobManager):
self.session.commit()
self.unlock()
# if there are no jobs running after the submission step, we should have finished the whole queue
if die_when_finished and not repeat_execution and len(running_tasks) == 0:
logger.info("Stopping task scheduler since there are no more jobs running.")
break
# THIRD: sleep the desired amount of time before re-checking
time.sleep(sleep_time)
......
......@@ -13,6 +13,7 @@ class JobManager:
def __init__(self, database, wrapper_script = './bin/jman', debug = False):
self._database = os.path.realpath(database)
self._engine = sqlalchemy.create_engine("sqlite:///"+self._database, echo=debug)
self._session_maker = sqlalchemy.orm.sessionmaker(bind=self._engine)
# store the command that this job manager was called with
self.wrapper_script = wrapper_script
......@@ -35,10 +36,10 @@ class JobManager:
"""Generates (and returns) a blocking session object to the database."""
if hasattr(self, 'session'):
raise RuntimeError('Dead lock detected. Please do not try to lock the session when it is already locked!')
Session = sqlalchemy.orm.sessionmaker()
if not os.path.exists(self._database):
self._create()
self.session = Session(bind=self._engine)
# now, create a session
self.session = self._session_maker()
logger.debug("Created new database session to '%s'" % self._database)
return self.session
......@@ -46,8 +47,8 @@ class JobManager:
"""Closes the session to the database."""
if not hasattr(self, 'session'):
raise RuntimeError('Error detected! The session that you want to close does not exist any more!')
self.session.close()
logger.debug("Closed database session of '%s'" % self._database)
self.session.close()
del self.session
......@@ -63,6 +64,7 @@ class JobManager:
logger.debug("Created new empty database '%s'" % self._database)
def get_jobs(self, job_ids = None):
"""Returns a list of jobs that are stored in the database."""
q = self.session.query(Job)
......@@ -100,6 +102,25 @@ class JobManager:
# set the 'executing' status to the job
job.execute(array_id)
if job.status == 'failure':
# a job that this job depends on has failed before
# stop this job and all jobs that depend on it from executing
dependent_jobs = job.get_jobs_waiting_for_us()
dependent_job_ids = set([dep.id for dep in dependent_jobs] + [job.id])
while len(dependent_jobs):
# take the next waiting job and also collect the jobs that in turn wait for it
dep = dependent_jobs.pop(0)
new = dep.get_jobs_waiting_for_us()
dependent_jobs += new
dependent_job_ids.update([d.id for d in new])
self.unlock()
try:
self.stop_jobs(list(dependent_job_ids))
logger.warn("Deleted dependent jobs '%s' since this job failed.")
except:
pass
return
# get the command line of the job
command_line = job.get_command_line()
self.session.commit()
......@@ -116,6 +137,7 @@ class JobManager:
jobs = self.get_jobs((job_id,))
if not len(jobs):
# it seems that the job has been deleted in the meantime
logger.error("The job with id '%d' could not be found in the database!" % job_id)
return
job = jobs[0]
......@@ -128,9 +150,17 @@ class JobManager:
def list(self, job_ids, print_array_jobs = False, print_dependencies = False, long = False):
"""Lists the jobs currently added to the database."""
# configuration for jobs
fields = ("job-id", "queue", "status", "job-name", "arguments")
lengths = (20, 9, 14, 20, 43)
format = "{:^%d} {:^%d} {:^%d} {:^%d} {:^%d}" % lengths
if print_dependencies:
fields = ("job-id", "queue", "status", "job-name", "dependencies", "submitted command line")
lengths = (20, 9, 14, 20, 30, 43)
format = "{:^%d} {:^%d} {:^%d} {:^%d} {:^%d} {:<%d}" % lengths
dependency_length = lengths[4]
else:
fields = ("job-id", "queue", "status", "job-name", "submitted command line")
lengths = (20, 9, 14, 20, 43)
format = "{:^%d} {:^%d} {:^%d} {:^%d} {:<%d}" % lengths
dependency_length = 0
array_format = "{:>%d} {:^%d} {:^%d}" % lengths[:3]
delimiter = format.format(*['='*k for k in lengths])
array_delimiter = array_format.format(*["-"*k for k in lengths[:3]])
......@@ -143,7 +173,7 @@ class JobManager:
self.lock()
for job in self.get_jobs(job_ids):
print job.format(format, print_dependencies, None if long else 43)
print job.format(format, dependency_length, None if long else 43)
if print_array_jobs and job.array:
print array_delimiter
for array_job in job.array:
......@@ -232,10 +262,10 @@ class JobManager:
if array_jobs:
job = array_jobs[0].job
for array_job in array_jobs:
logger.debug("Deleting array job '%d' of job '%d'" % array_job.id, job.id)
logger.debug("Deleting array job '%d' of job '%d' from the database." % array_job.id, job.id)
_delete(array_job)
if not job.array:
logger.info("Deleting job '%d'" % job.id)
logger.info("Deleting job '%d' from the database." % job.id)
_delete(job, True)
else:
......@@ -245,10 +275,10 @@ class JobManager:
# delete all array jobs
if job.array:
for array_job in job.array:
logger.debug("Deleting array job '%d' of job '%d'" % (array_job.id, job.id))