Slurm Workload Manager (Fimm)

From HPC documentation portal

Overview

Slurm is an open-source workload manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.

Commands

sinfo - reports the state of partitions and nodes managed by Slurm.

squeue - reports the state of jobs or job steps.

scontrol - views or modifies Slurm configuration and state; for example, scontrol show partition displays the configuration of every partition.

sbatch - submits a job script for later execution.

scancel - cancels a pending or running job or job step.

srun - submits a job for execution or initiates job steps in real time.

For more information about any Slurm command, consult its man page:

man <command>
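As a sketch of typical day-to-day usage of the commands above (the job ID 12345 is a placeholder for a real ID returned by sbatch):

```shell
sinfo                      # overview of partitions and node states
squeue -u $USER            # list only your own jobs
squeue -j 12345            # status of one specific job (placeholder job ID)
scancel 12345              # cancel that job
scontrol show partition    # configuration of every partition
```

These commands only work on a cluster running Slurm; the exact output columns depend on the site configuration.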

sbatch script

#!/bin/bash
#SBATCH --nodes=1                 # request a single node
#SBATCH --ntasks=1                # run one task
#SBATCH --mem-per-cpu=1G          # memory per allocated CPU
#SBATCH --time=30:00              # 30 minutes of walltime; the default is 15 minutes
#SBATCH --output=my.stdout        # file for the job's standard output
#SBATCH --mail-user=saerda@uib.no
#SBATCH --mail-type=ALL           # mail on job begin, end, and failure
#SBATCH --job-name="slurm_job"
#
# Put the commands for executing the job below this line
#
sleep 30
hostname
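Assuming the script above is saved as job.sh (a placeholder name), it can be submitted and monitored like this:

```shell
sbatch job.sh        # submit the script; Slurm prints the assigned job ID
squeue -u $USER      # watch the job while it is pending or running
cat my.stdout        # after completion, the output is in my.stdout
```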

MPI program

#!/bin/bash
# CPU accounting is not currently enforced.
#SBATCH -A <account>              # account to charge
#SBATCH -N 2                      # request two nodes
# Use --exclusive to reserve whole nodes exclusively for this job
#SBATCH --exclusive
#SBATCH --time=01:00:00           # one hour of walltime
#SBATCH -c 2                      # two CPUs per task
srun -n 10 ./mpi_program          # launch 10 MPI tasks across the allocated nodes
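A sketch of building and submitting the MPI job above, assuming the source file mpi_program.c and the script name mpi_job.sh (both placeholder names) and an MPI compiler wrapper such as mpicc on the path:

```shell
# Depending on the site, an MPI environment module may need to be loaded first,
# e.g. with "module load" (module name varies by cluster).
mpicc -o mpi_program mpi_program.c   # build the MPI binary
sbatch mpi_job.sh                    # submit the job script to Slurm
```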