PBS Questions

What is resources_used.cpupercent ?

I’ve run a PBS job and qstat shows the following: resources_used.cpupercent = 196 How is it possible to use more than 100% of cpus?

PBS polls all running jobs every 120 seconds. At each polling cycle it calculates an integer value called cpupercent. This is a moving weighted average of CPU usage for the cycle, expressed as a percentage of one CPU, so a job that uses more than one CPU can show more than 100. For example, a value of 50 means that during that period the job used 50 percent of one CPU. A value of 300 means that during that period the job used an average of three CPUs.

Therefore, if your job uses just one core, cpupercent should be close to 100. If your job can use multiple cores and you have asked for 32 cores, it should be close to 3200. If it is a lot less than that, your job is not running efficiently.
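
As a rough check, you can turn cpupercent into an efficiency figure by dividing it by 100 times the number of cores you asked for. Here is a minimal Python sketch of that arithmetic. The function name cpu_efficiency is made up for this example, and the 1600 value is hypothetical, not taken from a real job.

```python
def cpu_efficiency(cpupercent: int, ncpus: int) -> float:
    """Return the fraction of the requested cores that the job kept busy.

    cpupercent is the value reported by qstat (100 = one full CPU).
    ncpus is the number of cores the job asked for.
    """
    if ncpus <= 0:
        raise ValueError("ncpus must be a positive integer")
    return cpupercent / (100 * ncpus)


# A 32-core job showing 3200 cpupercent is using every core it asked for.
print(cpu_efficiency(3200, 32))  # 1.0

# Hypothetical: the same job showing 1600 is only using half of them.
print(cpu_efficiency(1600, 32))  # 0.5
```

A result close to 1.0 means the job is using what it requested. A result well below 1.0 suggests asking for fewer cores or checking that the application is really running in parallel.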

What is resources_used.vmem ?

I’ve run a PBS job and qstat shows the following:

resources_used.mem = 656112kb     <== about 0.6 GB
resources_used.vmem = 5038504kb   <== about 5 GB

What is “vmem”?

This is “virtual memory”. Virtual memory includes the physical RAM that was used, plus any files on disk that have been mapped into memory, plus any “swap” memory used. Some applications allocate a lot of virtual memory even though they only need a fraction of it. In most cases you can just ignore this value.
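
If you want to read the kb values as gigabytes, the conversion can be sketched in Python as below. This assumes that PBS "kb" means 1024 bytes, so dividing by 1024 twice gives GiB. The function name pbs_kb_to_gib is made up for this example.

```python
def pbs_kb_to_gib(value: str) -> float:
    """Convert a PBS size string such as '656112kb' to GiB.

    Assumption: 'kb' here means 1024 bytes, so 1 GiB = 1024 * 1024 kb.
    """
    if not value.endswith("kb"):
        raise ValueError(f"expected a value ending in 'kb', got {value!r}")
    return int(value[:-2]) / (1024 * 1024)


mem = pbs_kb_to_gib("656112kb")
vmem = pbs_kb_to_gib("5038504kb")
print(f"mem  = {mem:.1f} GiB")   # mem  = 0.6 GiB
print(f"vmem = {vmem:.1f} GiB")  # vmem = 4.8 GiB
```

The mem figure is the one to compare against the memory you requested.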

What is resources_used.cput ?

Suppose we have this job:

$ qstat
Job id            Name             User              Time Use S Queue
----------------  ---------------- ----------------  -------- - -----
1975491.hpcnode0  blast_genome23   uXXXXXXXX         82:33:35 R gpuq

The “Time Use” column above shows 82 hours, but this job has not been running for 82 hours. If I look at the “time” attributes of that job, I see that the walltime used is 41 hours.

$ qstat -f 1975491.hpcnode0 | grep time
resources_used.walltime = 41:20:27      <== walltime = 41 hours
ctime = Mon Oct 10 21:52:20 2022
mtime = Wed Oct 12 15:12:47 2022
qtime = Mon Oct 10 21:52:20 2022
Resource_List.walltime = 96:00:00

To understand this, look at the CPU attributes of the job:

$ qstat -f 1975491.hpcnode0 | grep cpu
resources_used.cpupercent = 199
resources_used.cput = 82:25:35
resources_used.ncpus = 5
exec_vnode = (hpcnode10:mem=134217728kb:ncpus=5:ngpus=2)
Resource_List.ncpus = 5
Resource_List.select = 1:mem=128gb:ncpus=5

This job had cpupercent = 199, i.e. about 200%, so it was fully using two cores. The job asked for 5 CPUs. If it had used all 5 CPUs all the time, cpupercent would have been 500.

Two CPUs used x a walltime of 41 hours = 82 hours of CPU time. That is how we get the 82 hours shown by qstat: cput is the CPU time added up over all the cores the job used.
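
The same reasoning works in reverse: dividing cput by walltime gives the average number of CPUs the job kept busy. Here is a minimal Python sketch using the values from the qstat -f output above. The function names are made up for this example.

```python
def hms_to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string, as printed by qstat, to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def average_cpus_used(cput: str, walltime: str) -> float:
    """Average number of CPUs kept busy = CPU time / elapsed time."""
    return hms_to_seconds(cput) / hms_to_seconds(walltime)


avg = average_cpus_used("82:25:35", "41:20:27")
print(f"average CPUs used: {avg:.2f}")        # average CPUs used: 1.99

# The job requested ncpus = 5, so compare against that.
print(f"share of requested CPUs: {avg / 5:.0%}")  # share of requested CPUs: 40%
```

An average of about 2 CPUs agrees with the cpupercent of 199. Since the job asked for 5 CPUs, it used roughly 40% of what it requested.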