A SLURM helper script

queue.py

A lightweight CLI wrapper around Slurm for submitting, monitoring, and managing jobs.

Installation

Clone the repository into ~/.local/share/queue.py:

git clone <repo-url> ~/.local/share/queue.py

Make the script executable:

chmod +x ~/.local/share/queue.py/queue.py

Add it to your PATH by appending this to your ~/.bashrc:

export PATH="$HOME/.local/share/queue.py:$PATH"

Then reload your shell:

source ~/.bashrc

Bash completions

Add this line to your ~/.bashrc (after the PATH export above):

eval "$(queue.py completions)"

This enables tab-completion for subcommands, flags, cluster names, job IDs, and Python files.

Configuration

Edit the top of queue.py to change defaults:

DEFAULT_CLUSTER = "turing"   # Slurm partition to submit to
DEFAULT_CORES = 2            # CPUs per job

Usage

Show the current queue

queue.py

Runs squeue -a and prints all jobs across all partitions.

Submit a job

queue.py train.py
queue.py train.py --cluster gpu --cores 8
queue.py train.py --name my-experiment --env SEED=42 --env CUDA_VISIBLE_DEVICES=0,1

Submits train.py to Slurm via sbatch and automatically attaches to the live output. Press Ctrl+C to detach — the job keeps running and logs are preserved.

Options:

  • --cluster NAME — target Slurm partition (default: turing)
  • --cores N — number of CPU cores (default: 2, clamped to cluster max)
  • --venv PATH — path to a Python virtualenv (default: ~/.local/share/queue.py/venv)
  • --env VAR=VALUE — set an environment variable in the job, repeatable
  • --name JOB_NAME — custom job name (default: derived from the script filename)
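The options above map naturally onto an argparse parser. The following is a minimal sketch, not the actual parser in queue.py: the function names build_submit_parser and parse_env are hypothetical, and the defaults are simply copied from the list above.

```python
import argparse


def build_submit_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented submit options."""
    parser = argparse.ArgumentParser(prog="queue.py")
    parser.add_argument("script", help="Python file to submit")
    parser.add_argument("--cluster", default="turing", metavar="NAME")
    parser.add_argument("--cores", type=int, default=2, metavar="N")
    parser.add_argument("--venv", default="~/.local/share/queue.py/venv", metavar="PATH")
    # action="append" is what makes --env repeatable
    parser.add_argument("--env", action="append", default=[], metavar="VAR=VALUE")
    parser.add_argument("--name", default=None, metavar="JOB_NAME")
    return parser


def parse_env(pairs):
    """Turn ["SEED=42", ...] into {"SEED": "42", ...}."""
    env = {}
    for pair in pairs:
        # Split on the first "=" only, so values may contain "=" themselves
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected VAR=VALUE, got {pair!r}")
        env[key] = value
    return env
```

Splitting on the first "=" keeps values such as CUDA_VISIBLE_DEVICES=0,1 intact, and a missing "=" fails early instead of silently exporting an empty variable.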

Attach to a running job

queue.py attach 12345

Tails the job's stdout and stderr in real time. Stdout prints normally, stderr prints in red. Detach with Ctrl+C — the job and logs are unaffected.
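One way to implement this kind of tailing is to poll both log files and print whatever complete lines have appeared since the last pass. The sketch below assumes that approach; the helper names (format_line, drain, follow) and the polling interval are illustrative, and the real queue.py may work differently.

```python
import time

RED = "\033[31m"
RESET = "\033[0m"


def format_line(line: str, is_stderr: bool) -> str:
    """Strip the newline and wrap stderr lines in ANSI red."""
    text = line.rstrip("\n")
    return f"{RED}{text}{RESET}" if is_stderr else text


def drain(handle) -> list:
    """Return the complete lines appended to an open file since the last call."""
    lines = []
    while True:
        pos = handle.tell()
        line = handle.readline()
        if not line.endswith("\n"):
            # EOF or a half-written line: rewind and retry on the next poll
            handle.seek(pos)
            break
        lines.append(line)
    return lines


def follow(stdout_path: str, stderr_path: str, poll_seconds: float = 0.5) -> None:
    """Print new output from both logs until interrupted with Ctrl+C."""
    with open(stdout_path) as out, open(stderr_path) as err:
        try:
            while True:
                for line in drain(out):
                    print(format_line(line, is_stderr=False))
                for line in drain(err):
                    print(format_line(line, is_stderr=True))
                time.sleep(poll_seconds)
        except KeyboardInterrupt:
            pass  # only the tailing loop stops; the job itself is untouched
```

Because the loop only reads the log files, interrupting it cannot affect the job, which matches the detach behavior described above.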

View logs for a finished job

queue.py logs 12345

Opens the job's stdout and stderr in less with color. Stdout appears under a green header, stderr under a red header.
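A colored view like this can be produced by joining the two logs under ANSI-colored headers and piping the result to less -R. This is a sketch under that assumption; the header text and the function names render_logs and show_in_pager are made up for illustration.

```python
import subprocess

GREEN = "\033[32m"
RED = "\033[31m"
RESET = "\033[0m"


def render_logs(stdout_text: str, stderr_text: str) -> str:
    """Join both logs into one page with a colored header per stream."""
    sections = [
        f"{GREEN}=== stdout ==={RESET}",
        stdout_text.rstrip("\n"),
        f"{RED}=== stderr ==={RESET}",
        stderr_text.rstrip("\n"),
    ]
    return "\n".join(sections) + "\n"


def show_in_pager(text: str) -> None:
    """Open the text in less; -R passes ANSI color escapes through."""
    subprocess.run(["less", "-R"], input=text, text=True)
```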

Cancel a job

queue.py cancel 12345

View job history

queue.py history

Shows a table of all jobs submitted through queue.py with their current status (queried live from Slurm), cluster, cores, submission time, and command.
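The static columns of that table could be stored as a JSON list with one entry per submitted job. The sketch below assumes such a layout; the field names and the helpers load_history and record_job are hypothetical, and the actual schema of history.json is not documented here. The live status is deliberately not stored, since it is looked up from Slurm each time.

```python
import json
from pathlib import Path


def load_history(path: Path) -> list:
    """Return the recorded jobs, or an empty list if nothing was recorded yet."""
    if not path.exists():
        return []
    return json.loads(path.read_text())


def record_job(path: Path, job_id: int, cluster: str, cores: int,
               command: str, submitted_at: str) -> None:
    """Append one job entry to the history file (field names are assumed)."""
    entries = load_history(path)
    entries.append({
        "job_id": job_id,
        "cluster": cluster,
        "cores": cores,
        "command": command,
        "submitted_at": submitted_at,
    })
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2))
```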

Manage the virtualenv

queue.py venv

Opens a new shell with the job virtualenv activated. Your ~/.bashrc is sourced first, so aliases and prompt customizations are preserved. Install dependencies here and they'll be available to submitted jobs:

queue.py venv
pip install torch numpy wandb
exit

Show help

queue.py help

File locations

What         Where
Job history  ~/.local/share/queue.py/history.json
Job logs     ~/.cache/queue.py/logs/
Virtualenv   ~/.local/share/queue.py/venv/

How it works

When you submit a job, queue.py generates a bash script that activates the virtualenv and runs your Python file, then passes it to sbatch. It parses the job ID from the response, records it in the history file, and immediately attaches to the log files so you see output as it happens.
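The two core steps, generating the batch script and reading the job ID back, can be sketched as follows. This is an illustration rather than the code in queue.py: build_job_script and parse_job_id are hypothetical names, and the log file naming (%j.out and %j.err, where %j is sbatch's job ID placeholder) is an assumption. The #SBATCH options and the "Submitted batch job N" response are standard Slurm.

```python
import re
import shlex


def build_job_script(script, venv, cluster, cores, name, log_dir, env=None):
    """Return the text of a bash script suitable for passing to sbatch."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={name}",
        f"#SBATCH --partition={cluster}",
        f"#SBATCH --cpus-per-task={cores}",
        f"#SBATCH --output={log_dir}/%j.out",  # assumed naming scheme
        f"#SBATCH --error={log_dir}/%j.err",
    ]
    for key, value in (env or {}).items():
        lines.append(f"export {key}={shlex.quote(value)}")
    # venv should be an absolute path: quoting would stop "~" from expanding
    lines.append(f"source {shlex.quote(venv + '/bin/activate')}")
    lines.append(f"python {shlex.quote(script)}")
    return "\n".join(lines) + "\n"


def parse_job_id(sbatch_output: str) -> int:
    """Extract the numeric job ID from sbatch's response."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))
```

The generated text can be handed to sbatch on standard input, for example with subprocess.run(["sbatch"], input=text, text=True, capture_output=True), after which parse_job_id reads the ID from the captured stdout.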

The virtualenv is created automatically on first use if it doesn't exist. All jobs share the same venv by default, so you install dependencies once and they're available everywhere. Use --venv to override per-job if needed.
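Create-on-first-use can be expressed with the standard library's venv module. The sketch below is an assumption about how such a check might look, not the implementation in queue.py; ensure_venv and its create parameter are hypothetical, and the bin/python check assumes a Linux layout.

```python
import venv
from pathlib import Path


def ensure_venv(path: Path, create=None) -> bool:
    """Create the virtualenv at `path` if it is missing.

    Returns True if a new environment was created, False if one already
    existed. `create` lets callers swap in a different builder.
    """
    if (path / "bin" / "python").exists():
        return False
    builder = create or (lambda p: venv.create(p, with_pip=True))
    builder(path)
    return True
```

A per-job --venv override would then amount to calling ensure_venv on the user-supplied path instead of the shared default.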

Cluster information (available partitions and their maximum CPU counts) is queried from sinfo, so the requested core count is validated and clamped before submission.
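As an illustration of the clamping step, the sketch below parses the output of sinfo -h -o "%P %c" (partition name and CPUs per node, without a header) and limits a request to the partition's maximum. The exact sinfo invocation used by queue.py is not stated, so this format string and the names parse_sinfo and clamp_cores are assumptions.

```python
def parse_sinfo(output: str) -> dict:
    """Map each partition to the largest CPUs-per-node value sinfo reported."""
    limits = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        name, cpus = parts
        name = name.rstrip("*")    # sinfo marks the default partition with "*"
        digits = cpus.rstrip("+")  # tolerate values printed as a minimum, e.g. "16+"
        if not digits.isdigit():
            continue
        limits[name] = max(limits.get(name, 0), int(digits))
    return limits


def clamp_cores(requested: int, cluster: str, limits: dict) -> int:
    """Validate the partition name and clamp the request to [1, partition max]."""
    if cluster not in limits:
        raise ValueError(f"unknown partition: {cluster!r}")
    return max(1, min(requested, limits[cluster]))
```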