Rewrite for SLURM

Amir MOHAMMADI requested to merge slurm into main

Basic usage looks like this:

# e.g.
gridtk submit --partition gpu --gpus 1 --job-name train-gpu --- python --batch-size 16

Or, if you already have a bash script, you don't need the --- separator:

# e.g.
gridtk submit

To avoid specifying your project (the SLURM account) on every submission, export SBATCH_ACCOUNT from your shell profile:

echo "export SBATCH_ACCOUNT=my_project" >> ~/.bashrc
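Running the echo line above more than once appends a duplicate export each time. A minimal sketch of an idempotent variant follows; the helper name add_sbatch_account and its optional second argument are hypothetical and not part of gridtk:

```shell
# Hypothetical helper (not part of gridtk): append the export line to a
# shell profile only if that exact line is not already present.
add_sbatch_account() {
  account="$1"
  rcfile="${2:-$HOME/.bashrc}"   # assumed default profile; override as needed
  line="export SBATCH_ACCOUNT=$account"
  # -x matches the whole line, -F treats the pattern as a fixed string
  grep -qxF "$line" "$rcfile" 2>/dev/null || echo "$line" >> "$rcfile"
}
```

Call it as add_sbatch_account my_project, then open a new shell (or source the profile) so the variable takes effect.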

To see what jobs you have already submitted, run:

gridtk list

To see the logs of one of the jobs, run:

gridtk report -j 1

Run gridtk --help for more information:

$ gridtk --help
Usage: gridtk [OPTIONS] COMMAND [ARGS]...

  GridTK command line interface.

Options:
  -d, --database FILE       Path to the database file.  [default: jobs.sql3]
  -l, --logs-dir DIRECTORY  Path to the logs directory.  [default: logs]
  --help                    Show this message and exit.

Commands:
  submit    Submit a job to the queue.
  resubmit  Resubmit a job to the queue.
  stop      Stop a job from running.
  list      List jobs in the queue, similar to sacct and squeue.
  report    Report on jobs in the queue.
  delete    Delete a job from the queue.

You can still use SLURM commands such as scontrol, sacct, and squeue to get more information about your jobs. Pass them the slurm-id shown in gridtk list for the submission you are interested in.
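As a rough sketch of what that can look like, the helper below runs the three SLURM commands for one job. The function names and the DRY_RUN switch are hypothetical conveniences, not part of gridtk, and the job id used in the usage note is a placeholder:

```shell
# Hypothetical helper (not part of gridtk): run a command, or only print it
# when DRY_RUN=1 is set.
_slurm_run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Query SLURM about one job, given the slurm-id shown by `gridtk list`.
slurm_job_info() {
  job_id="$1"
  if [ -z "$job_id" ]; then
    echo "usage: slurm_job_info SLURM_ID" >&2
    return 1
  fi
  _slurm_run squeue -j "$job_id"          # queue state while pending or running
  _slurm_run sacct -j "$job_id"           # accounting record, also after the job ends
  _slurm_run scontrol show job "$job_id"  # full job details
}
```

For example, DRY_RUN=1 slurm_job_info 123456 prints the three commands without contacting the scheduler; drop DRY_RUN to run them on a cluster login node.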

Edited by Amir MOHAMMADI