Data Storage
The following is a list of the storage options available on Hazel. For more details, see the video "Hazel: Navigating the different types of HPC storage".
To move data to and from the cluster, see File Transfer.
Storage options include:
- Basic user storage: home and scratch directories
- Supplemental group storage: OIT Research Storage, storage for user maintained software, and mass storage for instructional use
- Additional resources: NC State University Libraries Data Management Services
Basic user storage
The main types of storage available to the general user are the home directory and the scratch directory. Many users will only use these spaces.
Home directory
The home directory is intended for scripts, small application source code, and executables. There is not enough space in the home directory for large data files.
Each account has a quota of 15 GB and 10,000 files in /home/user_name.
This space is backed up daily. One copy of each file is retained in the backup. Deleted files are retained for about 7 days. Files not associated with a valid Unity ID will be archived.
To check how much space is available in the home directory, type quota_display.
If your home directory is at or over quota, remove or relocate files until usage falls back below the limits.
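When tracking down what is consuming the home quota, a few standard shell commands can report both space usage and file count. This is a generic sketch; quota_display reports the authoritative numbers.

```shell
# Show the ten largest files and directories under $HOME
du -ah "$HOME" 2>/dev/null | sort -rh | head -n 10

# Count files against the 10K-file quota
find "$HOME" -type f 2>/dev/null | wc -l
```

Large files that are still needed can be moved to scratch or supplemental storage rather than deleted.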
Scratch directory
Scratch space is intended for the storage needs of running jobs. Applications should use scratch space during job execution, i.e., jobs should be submitted from /share.
Scratch space is not backed up and files that have not been accessed for 30 days are automatically deleted.
Each project has 20 TB and 1M files of quota in /share/group_name. Users should create a subdirectory under this location for their use:
cd /share/group_name
mkdir user_name
To find group_name, run the groups command; the first group listed is the project group and should replace group_name above.
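The steps above can be scripted. The sketch below derives the project group automatically (assuming, as stated, that the first group reported by groups is the project group) and only attempts the mkdir when /share is actually mounted:

```shell
# First group listed by `groups` is the project group
group_name=$(groups | cut -d' ' -f1)
user_name=${USER:-$(id -un)}
scratch_dir="/share/${group_name}/${user_name}"

# Create the per-user subdirectory (only meaningful on the cluster,
# where /share exists)
if [ -d /share ]; then
    mkdir -p "${scratch_dir}"
fi
echo "Scratch directory: ${scratch_dir}"
```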
To check how much space is being used in a directory, type du -h -d 0. Depending on the number of files, this may take a long time.
To check overall /share quota usage, use the quota_display command.
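Because files unused for 30 days are purged, it is worth checking both the space a directory consumes and which files have passed the access-time cutoff. A generic sketch, run from inside your /share subdirectory on the cluster:

```shell
# Total size of the current directory, as a single summary line
du -h -d 0 .

# Files not accessed within the 30-day purge window
find . -type f -atime +30 -print
```

Files listed by the second command should be copied elsewhere (for example, to supplemental storage) if they need to be kept.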
Supplemental group storage
Supplemental storage is available for data and applications.
Directory for user maintained software
HPC provides space for user installed software. The space is backed up daily. If a /usr/local/usrapps/group_name directory does not exist, see the requirements for requesting the space on the HPC software page.
Directories in /usr/local/usrapps may not be used for data or as a working space from which to execute jobs. A compute node cannot write to /usr/local/usrapps. Globus and HPC-VCL cannot write to this space either.
There is a default quota of 100 GB and 250K files for each HPC project at /usr/local/usrapps/group_name. To check the quota, issue the command:
quota_display
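Since compute nodes cannot write to /usr/local/usrapps, software installs must be run from a login node. One common pattern is building a Python virtual environment into the usrapps tree; the path and environment name below are hypothetical:

```shell
# Hypothetical example: install a Python environment under usrapps.
# Run on a login node; compute nodes cannot write here.
group_name=$(groups | cut -d' ' -f1)
app_dir="/usr/local/usrapps/${group_name}/myenv"   # hypothetical path

if [ -d "/usr/local/usrapps/${group_name}" ]; then
    python3 -m venv "${app_dir}"
    "${app_dir}/bin/pip" install --upgrade pip
fi
```

Jobs then activate the environment read-only from compute nodes, while all writes (data, output) go to scratch.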
Research Storage
Each NC State researcher may obtain an allocation of research storage to use during their employment at NC State University. PIs may also obtain research storage allocations for awarded grants.
- OIT Research Storage documentation
- Please contact help@ncsu.edu for questions and assistance.
Archival Policy
Files in home not associated with a valid Unity ID will be archived to tape and then deleted after 1 year.
- The validation of Unity IDs for all HPC users is performed once a semester, following census day.
- If a Unity ID is no longer valid, i.e., the user has left the university, the archiving will be performed on their /home/user_name directory.
- If the invalid Unity ID belongs to a PI, the archiving will be performed on their directories only if there are no active accounts remaining under their project(s).
Additional Resources
Consulting and data management plans are available.
Libraries: Storage and Data Management Resources
The NC State University Libraries offers help during all phases of the research data lifecycle. Services include preparing data management plans (DMPs) for grant proposals, consulting on best practices for storage, organization, and preservation, and helping to optimize the sharing and discovery of data.