SLURM WORKLOAD MANAGER: THE COMPLETE GUIDE TO HPC JOB SCHEDULING: Cluster Resource Management, Batch Processing, and Parallel Computing for Supercomputers and Research ANIK RAO Labs Paperback – January 28, 2026

$18.40 cheaper than the new price!!

New  $30.66

Product details

Management number 219222611
Release Date 2026/05/03
List Price $12.26
Model Number 219222611

Run your Slurm clusters with confidence using an operator-grade guide to HPC job scheduling.

If you are responsible for a research or production cluster, you know how quickly Slurm can become opaque. Jobs sit pending on idle nodes, GPU queues lock up, accounting reports drift from reality, and every outage turns into a scramble to reconstruct what happened.

This book walks you through the full lifecycle of operating Slurm for supercomputers and research labs, from building an accurate picture of your nodes and partitions to tuning policy, debugging workload behavior, and recovering cleanly from incidents. You get concrete workflows rooted in real commands, Slurm configuration, and scenario-driven examples with administrators, PIs, ML engineers, and facility operators.

- Map partitions, TRES, GRES, and node states into a schedulability “truth table” you can defend
- Read scontrol, sinfo, sacct, and sreport outputs as linked signals instead of isolated commands
- Design Association and QOS structures that match funding, labs, and project allocations
- Understand Backfill and Fairshare so you can predict start times instead of guessing
- Rightsize CPU, memory, and GRES requests to real node topology and GPU layouts
- Package repeatable sbatch templates, Job Array pipelines, and deterministic outputs for labs
- Enforce isolation with cgroup-backed limits on CPU, memory, and GPUs without breaking workloads
- Use Reservation design to protect maintenance windows and training sessions without stranding capacity
- Build drift detection signals for phantom capacity, Job Array abuse, and missing SlurmDBD records
- Standardize a metrics dashboard and guard slurmrestd-based automation behind safe access patterns
- Apply tested recovery playbooks for controller failover, config distribution issues, and MPI launch regressions

The pages are rich with working sbatch headers, srun usage, Slurm configuration snippets, cgroup settings, and small scripts so you can move directly from reading to testing on a pilot partition or lab account.

Grab your copy today and turn your Slurm cluster into a system you can explain, defend, and reliably recover.

ISBN13 979-8246048771
Language English
Publisher Independently published
Dimensions 7 x 0.69 x 10 inches
Item Weight 1.47 pounds
Print length 304 pages
Publication date January 28, 2026
