Writing PBS Job Scripts

In this section we will copy some example scripts into our own directory and submit a job to the PBS scheduler. The end result is that we will have calculated some prime numbers. This example also uses the /scratch directory for reading and writing files; you should always use /scratch for this.

Important things to remember:
  1. Do not run large computations on the login node; use PBS.
  2. Use the /scratch/ directory for reading and writing files.

The login node is for logging in, editing your code, compiling it, and perhaps running tests using a small test data set. Your real computational work needs to be run via a PBS submission script so that it can be distributed to one of the dedicated compute nodes.
This page explains how you can do this.

Summary of running a job

  • Determine the resources required for your job.
  • Create a Job Script; this wraps your job in a shell script that tells PBS your requirements.
  • Submit the job using qsub.
  • Monitor the job using qstat.
  • Delete the job, if required, using qdel.

Never run large programs or large data sets on the cluster’s head node directly; use PBS to schedule your job into the queue. This ensures resources are allocated efficiently for everyone. If you need to test a script, run a smaller set of test data, preferably via PBS, instead.

Copy the example scripts to your own directory

The example scripts that you can use to practice submitting some short test jobs are in /shared/eresearch/pbs_job_examples/.

Copy these into your own directory using the following commands:

$ cd            <-- This will take you to the top of your home directory. 
$ mkdir jobs    <-- This creates the directory "jobs".
$ cd jobs       <-- This changes into the directory "jobs".

Now that you are in the new directory jobs, you can copy the example primes programs into it.
In the command below don’t forget the dot at the end.

$ cp -r /shared/eresearch/pbs_job_examples/primes .   

Now change directory into the new “primes” directory.

$ cd primes     

This will have recursively copied the directory primes and its contents to your own directory. You will now be in the directory “primes” and you can have a look at the scripts there.

To view a file like primes.py or primes_job.sh use the less command: less primes.py. Hitting the space bar moves down a page and hitting the q key quits the viewer.

Determine the resource requirements

To make effective use of the PBS queueing system, you will need to know what resources your job will use. When your job starts, PBS will make sure the resources you requested are available, up to the maximum you have specified.

The resources can be specified by:

  • CPU cores - If your application is multi-threaded and can make use of more than one core, you will need to specify the number of cores your script will use.

  • Memory - This is how much memory your application will use. With a new piece of software or a new dataset, you might not know how much will be consumed; in that case, start with a generous number and tune downwards. The more accurate your estimate, the more likely your job is to be scheduled during busy periods when only small amounts of memory are available.

  • Walltime - This is the maximum amount of time you want your program to run for; after this time the PBS scheduler will kill your job. Start by estimating a generous walltime based on test runs and tune downwards. The smaller the walltime, the more likely the job is to be scheduled during busy periods.

For the example primes.py program to calculate primes from 100,000 to 300,000 we will use 1 CPU, 5 GB RAM and set a wall time of 5 minutes.
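In the job script, these requests are written as #PBS directives. For the values above they would look something like the following (a sketch of just the directive lines, not a full script):

```shell
#PBS -N primes
#PBS -l ncpus=1
#PBS -l mem=5gb
#PBS -l walltime=00:05:00
```

The full example script in the next section uses the same directive syntax with larger values.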

Create a Job Script

Your job script tells PBS which HPC resources to reserve for your job. It should contain the following:

  • A name for your job so you will recognise it in the qstat list.
  • Your resource requirements for PBS to schedule your job - this needs to be at the top of your script for PBS to read it, before the first executable line in the script.
  • Any copying of data, setup of working directories and other pre-job administration that needs to take place
  • The job itself
  • Cleaning up temporary data, copying data to a longer term directory and other post-job administration

Have a look at the job script called submit_typical.sh from the examples directory /shared/eresearch/pbs_job_examples/primes/. This is well commented.

A shortened version of this, with fewer comments, is shown below. This job needs 4 cores and up to 30 minutes to complete, so we have specified a walltime of 40 minutes to ensure it finishes within the limit.

#!/bin/bash

#PBS -N primes
#PBS -l ncpus=4
#PBS -l mem=20gb
#PBS -l walltime=00:40:00

# Create a unique /scratch directory.
SCRATCH="/scratch/${USER}_${PBS_JOBID%.*}"
mkdir ${SCRATCH}

# Change to the PBS working directory where qsub was started from.
cd ${PBS_O_WORKDIR}

# Copy your input data to this scratch directory.
cp input.dat ${SCRATCH}

# Change directory to the scratch directory and run your program.
# my_program uses input.dat and creates an output file called output.dat
cd ${SCRATCH}
${PBS_O_WORKDIR}/my_program

# Copy results back to your working directory. 
mv ${SCRATCH}/output.dat ${PBS_O_WORKDIR}/

# Clean up: remove the copied input data, then the scratch directory.
cd ${PBS_O_WORKDIR}
rm ${SCRATCH}/input.dat
rmdir ${SCRATCH}
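The `${PBS_JOBID%.*}` expansion in the script above strips the trailing server name from the job ID, so the scratch directory name contains only the job number. A minimal sketch, using hypothetical stand-in values for the variables PBS would normally set:

```shell
# Hypothetical stand-ins for values PBS sets when the job runs:
USER="999777"
PBS_JOBID="11153.hpcnode1"

# "%.*" removes the shortest trailing ".suffix", leaving the job number.
SCRATCH="/scratch/${USER}_${PBS_JOBID%.*}"
echo "${SCRATCH}"    # prints /scratch/999777_11153
```

Because each job ID is unique, this guarantees two of your jobs will never write to the same scratch directory.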

Submit your job

Here we submit our job to the queue. Type man qsub for the online manual pages.

$ qsub submit_typical.sh
11153.hpcnode1

Qsub will return the assigned job ID. This is typically a number, followed by the name of the server from which you submitted the job. You can simply refer to the number in place of the full job ID.

Monitor your job status and list jobs

Below is an example of the output you will see. Type man qstat for the online manual pages.

$ qstat
Job id            Name             User            Time Use S Queue
----------------  ---------------- --------------  -------- - -----
211.hpcnode1      scaffold.build.  110234          570:36:5 R workq
235.hpcnode1      Histomonas.map.  100123                 0 Q workq
236.hpcnode1      run_job.sh       999777                 0 Q workq

Name is the name of your submitted script. User is your UTS staff user number. Time Use is the CPU time used so far. The S column indicates the job’s state, as in the table below:

Q : Job is queued.
R : Job is running.
E : Job is exiting after having run.
F : Job is finished.
H : Job is held.
S : Job is suspended.

The Queue will be workq unless you have specified another queue to use in your job submission script.

More information can be listed using command line options to qstat, such as -n1, which shows the node that the program is executing on.

$ qstat -n1 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
69580.hpcnode1  111111   workq    primes      22234   1   8    5gb 120:0 R 23:47 hpcnode6/2*8
69581.hpcnode1  111111   workq    primes      22698   1   8    5gb 120:0 R 23:47 hpcnode6/3*8
$ 

To list your finished jobs use -x (for expired). So for instance:

$ qstat -x 
                                                            Req'd  Req'd   Elap
Job ID          Username Queue    Jobname    SessID NDS TSK Memory Time  S Time
--------------- -------- -------- ---------- ------ --- --- ------ ----- - -----
1152.hpcnode1   999777     workq  primes      56678   1   1  250gb   --  F 00:09

To delete or cancel your job

To delete your job from the queue, use the qdel command:

$ qdel job_id

e.g. "qdel 69580.hpcnode1"

Type man qdel for the online manual pages.

To get detailed information on your job

To show details for a specific job use qstat -f job_id. For instance, for the job “primes” use:

$ qstat -f 1152.hpcnode1
Job Id: 1152.hpcnode1
    Job_Name = primes
    Job_Owner = 999777@hpcnode1
    resources_used.cpupercent = 0
    resources_used.cput = 00:01:20
    resources_used.mem = 3220kb
    resources_used.ncpus = 1
    resources_used.vmem = 315200kb
    resources_used.walltime = 00:01:59
    job_state = R
    queue = workq
    Error_Path = hpcnode1:/shared/homes/999777/jobs/primes/primes.e1152
    exec_host = hpc2/0
    Mail_Points = abe
    Mail_Users = 999777@uts.edu.au
    Output_Path = hpcnode1:/shared/homes/999777/jobs/primes/primes.o1152
    Rerunable = True
    Resource_List.mem = 250gb
    Resource_List.ncpus = 1
    Resource_List.nodect = 1
    Resource_List.place = pack
    Resource_List.select = 1:mem=250gb:ncpus=1:vmem=250gb
    Resource_List.vmem = 250gb
    stime = Wed Apr 10 15:25:50 2013
    jobdir = /shared/homes/999777
    Variable_List = PBS_O_SYSTEM=Linux,PBS_O_SHELL=/bin/bash,
    PBS_O_HOME=/shared/homes/999777,PBS_O_LOGNAME=999777,
    PBS_O_WORKDIR=/shared/homes/999777/jobs/primes/primes,
    PBS_O_LANG=en_US.UTF-8,
    PBS_O_PATH=/usr/local/bin:/bin:/usr/bin:/bin,
    PBS_O_MAIL=/var/spool/mail/999777,PBS_O_QUEUE=workq,PBS_O_HOST=hpcnode1
    comment = Job run at Wed Apr 10 at 15:25 on (hpc2:mem=262144000kb:ncpus=1)
    etime = Wed Apr 10 15:25:50 2013
    Submit_arguments = primes

Finishing up

A copy of your PBS job’s stdout and stderr streams is written to the directory from which you ran qsub, as a *.o file and a *.e file named after the job, with the job number appended.

An example of what the program primes.py and job number 1152 would produce is:

primes.e1152 - this contains any errors your program produced. Ideally it should be zero sized, i.e. empty.

primes.o1152 - this contains any screen output that your program produced.
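A quick way to confirm a job ran cleanly is to test whether its .e file is empty. A sketch, assuming the hypothetical file name primes.e1152 from the example above:

```shell
# "-s" is true for a file that exists and is non-empty, so "! -s"
# is true when the file is missing or empty (no errors reported).
if [ ! -s primes.e1152 ]; then
    echo "no errors reported"
else
    echo "errors found; inspect primes.e1152"
fi
```

This is handy at the end of a batch of jobs: loop over the *.e files and flag any that are non-empty.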
