Checking GPU Usage

You can obtain a lot of useful information about the NVIDIA GPUs installed on the GPU-enabled nodes using NVIDIA’s “System Management Interface” program, nvidia-smi. See its man page for details: man nvidia-smi.

Here is an example of its output:

GPUNode $ nvidia-smi

Tue Aug 11 09:46:51 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.51.05    Driver Version: 450.51.05    CUDA Version: 11.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  Off  | 00000000:3B:00.0 Off |                    0 |
| N/A   61C    P0   155W / 250W |  17289MiB / 32510MiB |     91%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  Tesla V100-PCIE...  Off  | 00000000:D8:00.0 Off |                    0 |
| N/A   62C    P0   152W / 250W |  17289MiB / 32510MiB |     89%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A     57573      C   python                          17283MiB |
|    1   N/A  N/A     57573      C   python                          17283MiB |
+-----------------------------------------------------------------------------+
$ 

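You can see that this node has two Tesla V100 GPUs installed. Both are at about 53% memory usage (17289MiB of 32510MiB) and about 90% GPU utilisation (91% and 89%). The process table shows that a single python process (PID 57573) is using both GPUs.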
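
If you want these figures in a script rather than reading them off the table, nvidia-smi can also print selected fields as CSV through its --query-gpu and --format options. The following is a minimal Python sketch, not a site-provided tool: the function names parse_gpu_csv and query_gpus are hypothetical, and it assumes every queried field returns a plain number (some GPUs report values such as "[N/A]" for certain fields, which this sketch does not handle).

```python
import subprocess

# Fields requested from nvidia-smi; `nvidia-smi --help-query-gpu` lists all of them.
QUERY_FIELDS = "index,name,utilization.gpu,memory.used,memory.total"


def parse_gpu_csv(text):
    """Parse `--format=csv,noheader,nounits` output into a list of dicts.

    Assumes one GPU per line and numeric values for every field except the name.
    """
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        index, name, util, used, total = [field.strip() for field in line.split(",")]
        gpus.append({
            "index": int(index),
            "name": name,
            "util_percent": int(util),
            "mem_used_mib": int(used),
            "mem_total_mib": int(total),
            # Memory usage as a percentage of the total, to one decimal place.
            "mem_percent": round(100 * int(used) / int(total), 1),
        })
    return gpus


def query_gpus():
    """Run nvidia-smi on the current node and return the parsed result."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY_FIELDS}", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if nvidia-smi exits with an error
    )
    return parse_gpu_csv(result.stdout)
```

Calling query_gpus() on a GPU node returns one dictionary per GPU, which you can print or log at intervals to follow a job's utilisation over time. The parsing is kept in a separate function so that it can be checked against saved output on a machine without a GPU.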
