Mountpoints

HPC Vega integrates several data storage technologies with different characteristics, each exposed as a separate file system. These provide SLING and EuroHPC users with storage capacity and predefined directories.

Name  Description               Type
exa5  High Performance Storage  Lustre
ceph  Large Capacity Storage    Ceph

High performance storage - Lustre

HPC Vega uses Lustre as the parallel distributed file system. Lustre is designed for efficient parallel I/O for large files. Therefore, we recommend the following to the users:

  • Avoid saving a large number of files in a single directory; split them across several directories instead.

  • Avoid accessing a large number of small files on Lustre.

  • Make sure that the stripe count for small files is 1.

  • Avoid using standard Linux commands rm and ls -l.

  • Instead of ls, use lfs find. You can find examples of use below:

Search the directory tree

The lfs find command searches the directory tree rooted at the given directory or file name for files that match the specified parameters. To review a list of all the options you can use with lfs find, execute lfs help find or man lfs.

Note that it is usually more efficient to use lfs find rather than use find when searching for files on Lustre.

Some of the most commonly used lfs find options are:

Parameter           Description
--atime             File was last accessed N*24 hours ago. (There is no guarantee that atime is kept coherent across the cluster.)
--mtime             File data was last modified N*24 hours ago.
--ctime             File status was last changed N*24 hours ago.
--maxdepth          Limits find to descend at most N levels of the directory tree.
--print / --print0  Prints the full file name, followed by a newline or a NUL character, respectively.
--size              File has a size in bytes, or in kilo-, mega-, giga-, tera-, peta-, or exabytes if a suffix is given.
--type              File has the type (block, character, directory, pipe, file, symlink, socket or Door [Solaris]).
--gid               File has a specific group ID.
--group             File belongs to a specific group (numeric group ID allowed).
--uid               File has a specific numeric user ID.
--user              File is owned by a specific user (numeric user ID is allowed).

$ lfs find /exa5/scratch/user/ -mtime +20 -type f -print

This command lists regular files under /exa5/scratch/user/ that were last modified more than 20 days ago.

For more options, please review the man page for the lfs utility.
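As a minimal sketch of the search above, the function below makes the directory and the age threshold parameters. The name list_old_files and the fallback to plain find on machines without the lfs client are assumptions added for illustration; they are not part of the Vega tooling.

```shell
#!/usr/bin/env bash
# list_old_files DIR DAYS
# Print regular files under DIR that were last modified more than DAYS days ago.
# Hypothetical helper: uses `lfs find` when the Lustre client is installed and
# falls back to plain `find` otherwise (for example on a workstation).
list_old_files() {
    local dir=$1 days=$2
    if command -v lfs >/dev/null 2>&1; then
        lfs find "$dir" -mtime +"$days" -type f -print
    else
        find "$dir" -type f -mtime +"$days" -print
    fi
}

# Example: count files older than 20 days in your Lustre scratch directory.
# list_old_files "/exa5/scratch/user/$USER" 20 | wc -l
```

Reviewing the list before deleting anything is a cheap way to check that the threshold matches what you intend to remove.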

Check disk space usage

The lfs df command displays the file system disk space usage. Additional parameters can be specified to display inode usage of each MDT/OST or a subset of OSTs. The usage for the lfs df command is:

lfs df [-i] [-h] [path]

where:

-i: Lists inode usage per OST and MDT.

-h: Output is printed in human-readable format, using SI base-2 suffixes for mega-, giga-, tera-, peta-, or exabytes.

By default, the usage of all mounted Lustre file systems is displayed. Otherwise, if a path is specified, the usage of the specified file system is displayed.

  • Instead of rm, use munlink, a Lustre-specific command that will simply delete a file. Below is an example of our recommended approach which consists of two steps.

The first step deletes all the files and symbolic links within a directory (and its subdirectories) using find and unlink (or munlink):

find -P ./mydir \( -type f -o -type l \) -print0 | xargs -0 -n 1 unlink

Here is an overview of each part of that command:

  • find searches the indicated directory (and the subdirectories within it). The expression defines a search for files and symbolic links.

  • -P prevents find from dereferencing symbolic links, so the search stays within the indicated directory tree and does not descend into the targets of links.

  • ./mydir is the directory where the search starts.

  • \( -type f -o -type l \) matches anything that is a regular file (-type f) or (-o) a symbolic link (-type l, a lowercase letter L). The escaped parentheses group the two tests so that -print0 applies to both.

  • -print0 prints each matching name terminated by a NUL character instead of a newline. This keeps unusual file names (for example, names containing spaces or newlines) intact for the following command (xargs), which is connected with the pipe.

  • The pipe (|) connects two commands: the output of find serves as the input of the following command (xargs in this case).

  • xargs -0 -n 1 converts the received list of names into arguments for the command specified at the end (in this case unlink). The -0 flag matches the NUL-terminated format: if you use -print0 in the find command, you must use -0 in the xargs command. The -n 1 flag passes one name per call, because unlink accepts exactly one operand.

  • unlink deletes each file and symbolic link it receives from xargs without overloading the metadata server.

To list all files in the current directory tree, use the command:

lfs find . -type f -print

The second step is to remove the empty directories and subdirectories in the tree. Once all of the files and symbolic links are deleted, you can remove the empty directories with a similar command:

find -P ./mydir -type d -empty -delete
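The two steps can be combined into one helper, sketched below under stated assumptions: the function name purge_tree is hypothetical, it relies on GNU find and xargs, and it uses unlink rather than munlink so that it also runs where the Lustre utilities are not installed.

```shell
#!/usr/bin/env bash
# purge_tree DIR
# Hypothetical helper that applies the two-step removal described above.
purge_tree() {
    local dir=$1
    if [ ! -d "$dir" ]; then
        echo "purge_tree: not a directory: $dir" >&2
        return 1
    fi

    # Step 1: unlink regular files and symbolic links, one name per call,
    # because unlink accepts exactly one operand. -r skips the call entirely
    # when nothing matched.
    find -P "$dir" \( -type f -o -type l \) -print0 | xargs -0 -r -n 1 unlink

    # Step 2: remove the now-empty directories. -delete implies -depth, so
    # subdirectories are removed before their parents, and DIR itself last.
    find -P "$dir" -type d -empty -delete
}

# Example:
# purge_tree ./mydir
```

Directories that still contain other file types (for example named pipes) are left in place, because step 2 only removes directories that are empty.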

File systems

The following file systems are available for users:

Area           Path                          Description
tmp            /tmp                          Local temporary directory on compute nodes for all users
home           /ceph/hpc/home/$USER          Home directory for users on a shared file system mounted on compute and login nodes
scratch        /exa5/scratch/user/$USER      Scratch directory on the High Performance Storage (Lustre); created on request, mounted on compute and login nodes
scratch        /ceph/hpc/scratch/user/$USER  Scratch directory on the Large Capacity Storage (Ceph); created on request, mounted on compute and login nodes
Local scratch  /scratch/slurm/jobid          Temporary node-local directory for jobs; can be exported as the variable $WORKDIR (created automatically by the Slurm prolog script and deleted after the job has finished)
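The node-local scratch directory can be used from a job script roughly as sketched below. This assumes that jobid in the path above is the Slurm job ID available in the SLURM_JOB_ID environment variable; the function name local_scratch_dir, the file input.dat and the program my_program are placeholders for illustration only.

```shell
#!/usr/bin/env bash
# local_scratch_dir
# Print the node-local scratch path for the current job.
# Assumption: the directory is named after the Slurm job ID.
local_scratch_dir() {
    if [ -z "${SLURM_JOB_ID:-}" ]; then
        echo "local_scratch_dir: SLURM_JOB_ID is not set" >&2
        return 1
    fi
    printf '/scratch/slurm/%s\n' "$SLURM_JOB_ID"
}

# Inside a job script (sketch; input.dat and my_program are placeholders):
#   WORKDIR=$(local_scratch_dir) || exit 1
#   export WORKDIR
#   cp "/ceph/hpc/home/$USER/input.dat" "$WORKDIR/"
#   cd "$WORKDIR" && my_program input.dat > result.out
#   cp result.out "/ceph/hpc/home/$USER/"   # copy back before the job ends
```

Because the directory is deleted when the job finishes, results have to be copied to a shared file system as the last step of the job.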

Home directory: each user's home directory has a default quota of 100 GB. All home directories are on a shared file system and provide internal snapshots (backups), which are kept for 30 days.

Scratch Directories:

On the High Performance Storage – Lustre /exa5/scratch/user/

On the Large Capacity Storage – Ceph /ceph/hpc/scratch/user/

These directories are created upon user request and are recommended for storing large amounts of non-persistent data. Files in these file systems are removed after one month, so please copy any data you want to keep to your home directory. Quotas are set on the scratch directories. Detailed information on quotas is given in the Quotas section below.

Users should check their disk space quota by using the following commands:

For the usage on Lustre file system:

lfs quota -h -u $USER /exa5/scratch/user/$USER/

For the usage on Ceph file system:

User home directory:

getfattr --only-values --absolute-names -n ceph.dir.rbytes "/ceph/hpc/home/$USER" | awk '{print $1/1024^3 " GB"}'

Shared project directory (replace <directory_name>):

getfattr --only-values --absolute-names -n ceph.dir.rbytes /ceph/hpc/data/<directory_name> | awk '{print $1/1024^3 " GB"}'
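The Ceph usage check can be wrapped in a small helper, sketched here under stated assumptions: the function names bytes_to_gib and ceph_dir_usage are hypothetical, and getfattr is called with --only-values so that only the numeric attribute value is read and digits in the path cannot leak into the result.

```shell
#!/usr/bin/env bash
# bytes_to_gib BYTES
# Convert a byte count to units of 1024^3 bytes, with two decimals.
bytes_to_gib() {
    LC_ALL=C awk -v b="$1" 'BEGIN { printf "%.2f\n", b / (1024 * 1024 * 1024) }'
}

# ceph_dir_usage DIR
# Print the recursive size of DIR on CephFS, read from the
# ceph.dir.rbytes extended attribute.
ceph_dir_usage() {
    local bytes
    bytes=$(getfattr --only-values --absolute-names -n ceph.dir.rbytes "$1") || return 1
    bytes_to_gib "$bytes"
}

# Examples:
# ceph_dir_usage "/ceph/hpc/home/$USER"
# bytes_to_gib 1073741824
```

The conversion is kept separate from the getfattr call so that it can be reused for byte counts obtained from other sources.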

If you need permanent storage of large amounts of data, please contact the support team: support@sling.si

Quotas

Quota    Capacity  Description
home     100 GB    Size of the home directory for each user
scratch  20 GB     Size of the scratch directory for each user

Increasing Quotas

For additional space, please contact the support team: support@sling.si