kalyanac/lustre_iostats

lustre_iostats

Job IO statistics on Lustre file systems for HPC Clusters

A quick and dirty utility to get raw job IO statistics on Lustre file systems. This utility is written in Python.

Known issue: On Lustre client 2.5.1, I noticed that the read_bytes and write_bytes lines now report the sample count plus 4 values instead of 3. The script must be corrected to read the second-to-last column instead of the last column.
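One way to avoid depending on which column is last is to index the values from the left. The sketch below is not the code in this repository. It assumes a stats line laid out as name, sample count, the word "samples", a unit in brackets, then min, max and sum, with an optional fourth value on newer clients. The function name parse_bytes_sum and the sample lines are made up for illustration.

```python
def parse_bytes_sum(line):
    """Return the cumulative byte count from a read_bytes/write_bytes line.

    Assumed layout (an assumption, check it against your client):
        name  count  samples  [unit]  min  max  sum  [extra value]
    """
    fields = line.split()
    # Everything after the "[unit]" token is a numeric value.
    values = fields[4:]
    if len(values) < 3:
        raise ValueError("unexpected stats line: %r" % line)
    # The sum is the third value whether 3 or 4 values are reported.
    return int(values[2])


old_style = "read_bytes 3 samples [bytes] 4096 1048576 2101248"
new_style = "read_bytes 3 samples [bytes] 4096 1048576 2101248 99999"
print(parse_bytes_sum(old_style), parse_bytes_sum(new_style))
```

Indexing from the left gives the same answer for both layouts, so the same script could run on mixed client versions.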

How to use:

Refer to the sample job script on how to use this utility.

External Dependencies:

Site Customizations:

In iostats_client.py, update the following to reflect the file system names at your site

MDCPath = "/proc/fs/lustre/llite/snx1100*"

OSCPath = "/proc/fs/lustre/osc/snx1100*"
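To check that a pattern matches what you expect at your site, a small helper like the following can list the files it resolves to. This is a sketch, not code from iostats_client.py. It assumes each matched directory holds a file named stats, and find_stats_files is a hypothetical name.

```python
import glob
import os


def find_stats_files(pattern):
    """Expand a site pattern and return the stats file under each match."""
    paths = []
    for directory in sorted(glob.glob(pattern)):
        # Assumption: every matched directory contains a "stats" file.
        candidate = os.path.join(directory, "stats")
        if os.path.isfile(candidate):
            paths.append(candidate)
    return paths


if __name__ == "__main__":
    for path in find_stats_files("/proc/fs/lustre/llite/snx1100*"):
        print(path)
```

An empty result usually means the pattern does not match the file system names mounted on that node.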

The server picks a port to listen on based on the job ID. The job ID format I worked with is server.jobid, where the jobid portion is numeric. If your job IDs are not numeric, customize the Port variable in iostats_client.py and iostats_server.py to match your needs. Using the job ID is convenient because there will never be a conflict when multiple jobs are using this utility.
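As an illustration of deriving a port from a job ID, here is a minimal sketch. The actual formula used for the Port variable is not shown in this README, so BASE_PORT, PORT_SPAN and port_for_job are hypothetical, and the modulo mapping is only one possible choice.

```python
BASE_PORT = 20000  # hypothetical start of the port range
PORT_SPAN = 40000  # hypothetical number of usable ports


def port_for_job(job_id):
    """Map a job ID with a numeric component to a TCP port."""
    # Take the first all-digit component, so "bw.123456" and
    # "123456.bw" both yield 123456.
    numeric = [part for part in job_id.split(".") if part.isdigit()]
    if not numeric:
        raise ValueError("no numeric component in job id: %r" % job_id)
    return BASE_PORT + int(numeric[0]) % PORT_SPAN


print(port_for_job("bw.123456"))
```

Note that folding the job ID into a fixed range means two jobs whose IDs differ by a multiple of PORT_SPAN would map to the same port, so pick a span larger than the number of jobs that can run at the same time.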

Issues

Error handling is not extensively tested. In particular, if permissions prevent the script from reading the necessary files in /proc, the server may wait forever.
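One possible way to keep the server from waiting forever is to put a deadline on the accept loop. The sketch below is not how iostats_server.py is written. It assumes clients report over TCP and close the connection when done, and collect_reports and its parameters are hypothetical.

```python
import socket
import time


def collect_reports(server_sock, expected, timeout_s):
    """Accept up to `expected` client reports, giving up after timeout_s."""
    deadline = time.monotonic() + timeout_s
    reports = []
    while len(reports) < expected:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        server_sock.settimeout(remaining)
        try:
            conn, _addr = server_sock.accept()
        except socket.timeout:
            break  # no more clients arrived before the deadline
        with conn:
            conn.settimeout(max(deadline - time.monotonic(), 0.1))
            chunks = []
            try:
                while True:
                    data = conn.recv(4096)
                    if not data:
                        break  # client closed the connection
                    chunks.append(data)
            except socket.timeout:
                pass  # keep whatever arrived so far
            reports.append(b"".join(chunks))
    return reports
```

The caller can compare len(reports) with the expected count and print partial statistics along with a warning, rather than hanging until the job is killed.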

Sample output

+---------------+-------------+--------------+
| File System   | File Read   | File Write   |
+===============+=============+==============+
| BW Home       | 320.0 MB    | 0.0 bytes    |
+---------------+-------------+--------------+
| BW Scratch    | 32.0 GB     | 32.0 GB      |
+---------------+-------------+--------------+
| BW Projects   | 0.0 bytes   | 0.0 bytes    |
+---------------+-------------+--------------+

Metadata statistics for this job:
+---------------+--------+---------+----------+-------------+---------+-----------+---------+
| File System   |   open |   close |   create |        seek |   fsync |   getattr |   mkdir |
+===============+========+=========+==========+=============+=========+===========+=========+
| BW Home       |    160 |     160 |        0 | 0           |       0 |       448 |       0 |
+---------------+--------+---------+----------+-------------+---------+-----------+---------+
| BW Scratch    |    256 |     256 |        0 | 1.67772e+07 |       0 |       256 |       0 |
+---------------+--------+---------+----------+-------------+---------+-----------+---------+
| BW Projects   |      0 |       0 |        0 | 0           |       0 |         0 |       0 |
+---------------+--------+---------+----------+-------------+---------+-----------+---------+

This work was developed while at NCSA. Please provide attribution as needed.
