Early Access, Design Changes, Implementation and Integration

Overview

RIS had a long history of managing specialized storage before joining WashU IT. Using the same underlying technologies and lessons learned, this experience was generalized into a Storage Service that could be offered to a wider audience at the University. Over time, details of the resources that make up this offering have changed due to new policies, designs, discoveries and technologies. These changes can lead to, or explain, intentional or accidental changes in service behavior or availability. This is a limitation of our development processes and technical details; a look at them may provide insight into some observed behaviors.

Components of a Storage Allocation

When the storage service is used, it employs a number of technical resources that all need to be coordinated together in order to function properly. Some of these may include:

  • A GPFS “fileset”

  • A GPFS fileset “link”

  • An SMB export

  • A “primary” storage group

  • A set of groups for each “project”

  • Additional filesets for projects or archives

  • Snapshots

  • Directories for all of the above

  • Access control lists for all of the directories

Each of these items has its own design and configuration, and together they implement controlled access to digital storage. These resources may have been created or modified by different processes or entities throughout the evolution of the service.

Changes to the Storage Service Over Time

Historical

GPFS, Disk groups in LIMS, storage0, identity management through IPA, NFS, private network

Early Access to Storage Service

GPFS, SMB integrated with ACCOUNTS Active Directory domain (WUSTL Key), WUSM network

It is known by the legacy gsc.wustl.edu name.

For some time it is dual-homed and known by both the gsc.wustl.edu and ris.wustl.edu names, which causes some headaches.

Early Features

Active

Free 5 TB via SMB, hard quotas, billed for additional usage. POSIX mode 0000, unseen because the storage is only used through SMB. Not browseable: directories cannot be seen without ACL permission. “Bypass Traverse Check” in SMB:

> The Bypass Traverse Check is implemented in GPFS for SMB clients only. Clients that use other protocols might still be locked out because the parent tree of an export has more restrictive ACLs than the export itself.

The design relies on this feature, which prevents future access via compute1.
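The effect described in the quote can be illustrated with a toy model. This is only a sketch under stated assumptions: the `ACLS` table, the paths, the user name and the `can_reach` function are all hypothetical, and the model does not reflect how GPFS or SMB actually evaluate ACLs. It only shows why a user can reach an export when the parent tree is skipped, yet be locked out when every parent directory is checked.

```python
from pathlib import PurePosixPath

# Hypothetical ACL table: directory -> users allowed to traverse/read it.
# Paths and user names are illustrative only.
ACLS = {
    "/storage": set(),            # parent tree: more restrictive, nobody listed
    "/storage/alloc": {"alice"},  # the exported allocation itself
}


def can_reach(path: str, user: str, bypass_traverse: bool) -> bool:
    """Return True if `user` can reach `path` in this toy model."""
    target = PurePosixPath(path)
    # The target's own ACL is always consulted.
    if user not in ACLS.get(str(target), set()):
        return False
    if bypass_traverse:
        # SMB-style behaviour in this model: parents are not checked.
        return True
    # Other protocols in this model: every parent must also permit the user.
    parents = [p for p in target.parents if str(p) != "/"]
    return all(user in ACLS.get(str(p), set()) for p in parents)


smb_ok = can_reach("/storage/alloc", "alice", bypass_traverse=True)
posix_ok = can_reach("/storage/alloc", "alice", bypass_traverse=False)
print(smb_ok, posix_ok)  # True False
```

In this model the same user and the same export give different results depending only on whether the parent tree is checked.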

Archive

Lower cost, stored to tape

Provision storage container/script

Where allocations had previously been made “by hand”, this brings some much-needed consistency to the provisioning process. However, it has shortcomings in managing existing allocations. Many changes are still made by hand.

Projects

New RW/RO groups for every project, ACL or POSIX _only_

Filesets changed for ACL + POSIX perms

Setting POSIX perms no longer wipes ACL, special ACEs can represent and influence POSIX permissions

Add default POSIX permissions

Observe applications break with mode 0000 files when used in compute, e.g. git. Start using heritable default of 0700.

Using Ansible

Defining allocations in a declarative style suitable to be used by a “desired state” tool and run idempotently to manage the configuration of an allocation throughout its lifecycle. This development leads to the “ris.research-storage-allocation” Ansible role.
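The “desired state” idea can be sketched in a few lines of Python. This is not the “ris.research-storage-allocation” role or any part of Ansible; the field names (`fileset`, `hard_quota_tb`, `soft_quota_tb`, `dir_mode`, `file_mode`), the values and both functions are hypothetical and chosen only for illustration. The point is that the definition declares what an allocation should look like, and applying it a second time changes nothing.

```python
from typing import Any


def plan_changes(desired: dict[str, Any], current: dict[str, Any]) -> dict[str, Any]:
    """Return only the settings whose current value differs from the desired one."""
    return {key: value for key, value in desired.items() if current.get(key) != value}


def apply_changes(current: dict[str, Any], changes: dict[str, Any]) -> dict[str, Any]:
    """Return a new state with the planned changes applied."""
    updated = dict(current)
    updated.update(changes)
    return updated


# Hypothetical declarative definition of one allocation.
desired = {
    "fileset": "example_alloc",
    "hard_quota_tb": 5,
    "soft_quota_tb": 5,
    "dir_mode": 0o700,
    "file_mode": 0o600,
}

# Hypothetical state as found on the system: one value drifted, one is missing.
current = {
    "fileset": "example_alloc",
    "hard_quota_tb": 5,
    "soft_quota_tb": 4,
    "dir_mode": 0o700,
}

first = plan_changes(desired, current)   # what the first run would change
current = apply_changes(current, first)
second = plan_changes(desired, current)  # what a second run would change

print(first)   # {'soft_quota_tb': 5, 'file_mode': 384}
print(second)  # {}
```

The empty second plan is what “idempotent” means here: once the allocation matches its definition, re-running the tool is a no-op.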

NFS turned on for engineering

A fact to be aware of.

Inherit different permissions for files and folders

Git is still not happy: files which should not be executable are made executable. Change to a default of 0700 for directories, but 0600 for files. Effectively a umask of 0077.

This still munges permissions in git, as it turns out any inherited ACE causes umask to be ignored. POSIX ACLs do this too and IBM has confirmed it is functioning as designed/desired.
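The relationship between the 0700/0600 defaults and a umask of 0077 is plain bit arithmetic, sketched below. The sketch assumes the conventional creation modes of 0777 for directories and 0666 for files; it does not model ACL inheritance, which, as noted above, causes the umask to be ignored.

```python
import stat

UMASK = 0o077

# Conventional creation modes with the umask bits cleared.
dir_mode = 0o777 & ~UMASK
file_mode = 0o666 & ~UMASK

print(oct(dir_mode), stat.filemode(stat.S_IFDIR | dir_mode))    # 0o700 drwx------
print(oct(file_mode), stat.filemode(stat.S_IFREG | file_mode))  # 0o600 -rw-------

# The earlier single default of 0700 marks every file as owner-executable,
# which is the behaviour git objected to.
old_file_is_executable = bool(0o700 & stat.S_IXUSR)
new_file_is_executable = bool(file_mode & stat.S_IXUSR)
print(old_file_is_executable, new_file_is_executable)  # True False
```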

The Storage Allocation Today

March 2020

RIS is developing new processes to manage all of this in a more specific and consistent manner. This includes a new interface, “ITSM”, and bringing all allocations up to our current design and standards. It is inevitable that changes and mistakes will occur, and having a record and method to refer to will ensure that such changes or mistakes are more easily caught and avoided in the future.

Recent work and changes may include:

  • Enumerating a definition for every allocation from existing resources

  • Changing fileset comments, which reflect service desk issues, department and fund numbers, contacts, etc.

  • Setting a soft quota to match the hard quota

  • Setting different defaults and inherited permissions for allocations and projects: different settings for directories (“executable”, or “traverse”) and files (not executable by default)

The most notable effect should be that files are no longer executable by default. Most allocations required no other changes. This was performed as a necessary step toward expanding the storage service to integrate with the compute service, as well as an improvement in consistency and design. During the process, a few recently changed settings may have been reversed:

  • New changes to quotas

  • New changes to member access or nested groups

When this was discovered, steps were taken to remedy it and to avoid it where possible. This is not expected to recur.