University of Missouri Division of Information Technology (DoIT)
Revision 1.2.6, 2015-06-04
Computer Science faculty, in collaboration with the Division of IT, have inspired investments by the National Science Foundation that have provided a 100 Gb/s network and much-needed high performance computing equipment and support. The remaining priority need is research data storage. In 2014, under the guidance of MU’s Cyberinfrastructure (CI) Council, the Division of IT purchased over 1 petabyte of storage to initialize service offerings for General Purpose Research Storage. The plan is to gather additional funding for the ever-growing need for research data storage through sponsored research and through investments by campus and UM-System.
The resulting General Purpose Research Storage (GPRS) offers a sustainable and flexible research data storage environment, providing research collaboration and computation storage resources at minimal cost. GPRS will support MU faculty and their students and collaborators in the sharing, analysis and visualization of research data. Campus computers and equipment (microscopes, etc.) can be connected to large pools of shared network storage through either Windows (SMB) or Linux (NFS). Integration with the campus identity management system enables fine-grained access control to support the sophisticated collaboration needs of researchers. GPRS is intended for research analysis rather than archiving and is not automatically backed up. The system is highly resilient to component failure, but resilience is not a substitute for backup; backup services are available for a fee.
In keeping with the tradition of funding research computing at MU, the plan is to accumulate necessary resources in partnership with MU’s researchers and their funding agencies. The intention is to offer a reasonable volume of storage at an affordable cost, and gather measures of usage and demand to enable and justify increased funding. The Division of IT’s Research Computing staff will interconnect and manage the data storage and will actively recruit and support additional investments by researchers and other units on campus including MU Libraries.
Research data storage can be allocated to an individual researcher or to a research lab or group of collaborators – graduate students must be sponsored by a faculty member as part of a group. Large blocks of data storage will be available for five years and will be professionally managed and accessible via dedicated network access. For an additional cost, offsite replication for backup will also be available. Research Computing staff will assist researchers with their data management plan as well as the cost and justification for grant proposals.
Fifty terabytes of storage have been designated for special cases where the storage will be provided at a reduced cost or at no charge. These are meant to be short exploratory projects or areas with special needs, with the size and duration of the need evaluated on a case-by-case basis. Special Projects might include promising exploratory research for short periods of time, exemplary research that has little chance of external funding, and other special use cases as administered by the CI Council. Requests for Special Projects storage will be reviewed by the CI Council. Special Projects must submit a data management plan including Data Classification Level information, and report each semester on storage usage and other relevant metrics. Requests should describe the storage needed and the length of time it is needed, as well as a brief description of the research significance and expected outputs.
|Type||Description||Costs for FY15 & FY16|
|Individual Researcher Storage||Researchers will be provided with 10 GB of private storage. Additional storage will be allocated in 256 GB increments. All storage used for sponsored research is considered Research Project Storage.||10 GB at no charge. $10 per TB per month for additional storage|
|Research Lab or Project Storage||Storage associated with a specific group of collaborating researchers or with instruments is allocated on a per-project basis, with 10 GB at no charge. Additional data storage will be allocated in 256 GB increments. Permissions will apply uniformly to the entire project.||One project and 10 GB at no charge. $10 per TB per month for additional storage|
|Large Storage Investor||Researchers with larger storage needs can purchase half or multiple entire nodes (around 100 TB per node) for 5 years of service. Investors receive dedicated access to one (half node) or two (full node) 10GE network storage ports. Nodes will be taken out of service after 5 years, but as part of a future storage offering, may be placed in a degraded storage system.||Contact Research Computing for details, and assistance with external funding requests if applicable.|
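As a rough illustration of the rates in the table above, the following sketch estimates a monthly charge for a requested quota. It rests on assumptions the policy does not state: that 1 TB is counted as 1024 GB, that the first 10 GB are always free, and that charges apply to whole 256 GB increments at a prorated share of the $10 per TB rate. The function name `monthly_cost` is hypothetical; contact Research Computing for actual billing details.

```python
import math

FREE_GB = 10               # no-charge allocation listed in the table
INCREMENT_GB = 256         # additional storage is allocated in 256 GB increments
RATE_PER_TB_MONTH = 10.0   # dollars per TB per month (FY15 and FY16)
GB_PER_TB = 1024           # ASSUMPTION: the policy does not define TB as 1000 or 1024 GB


def monthly_cost(requested_gb: float) -> float:
    """Estimate the monthly charge in dollars for a requested quota in GB.

    ASSUMPTION: billing covers whole 256 GB increments beyond the free 10 GB,
    each charged at a prorated share of the per-TB rate.
    """
    extra_gb = max(0.0, requested_gb - FREE_GB)
    increments = math.ceil(extra_gb / INCREMENT_GB)
    billed_tb = increments * INCREMENT_GB / GB_PER_TB
    return billed_tb * RATE_PER_TB_MONTH


if __name__ == "__main__":
    for gb in (10, 200, 1034, 5000):
        print(f"{gb:>5} GB requested -> ${monthly_cost(gb):.2f} per month")
```

Under these assumptions, any request within the free 10 GB costs nothing, and each additional 256 GB increment adds $2.50 per month.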
Storage Use Policy
- To assure affordability, the data stored in the GPRS will not be backed up. However, for an additional cost, backup services are available including remote replication.
- Only research data is to be stored. Administrative data, educational records and personal data are prohibited.
- The Data Classification Level (DCL) of the data must be declared and the policy adhered to. Only DCL 1 and 2 data are permissible on GPRS and accessible for use on Lewis. GPRS storage is available to the Lewis cluster via login and Data Transfer Nodes (DTN) but is not available on compute nodes.
- Direct connected systems that violate university IT policy or affect stability will be immediately remediated.
- Data storage rates are quoted in usable storage allocated (quota) for a period of time; snapshots count as used storage.
- Storage will be initially allocated by Research Support Computing staff. Researchers will be supported directly by IT Pros, who in turn will be supported by Research Support Computing staff.
- Storage used for sponsored research is considered to be Research Project Storage and must adhere to the funding agencies’ (NSF, DOE, etc.) data management policies.
- Failure to comply with the storage policy and other relevant university policies, or failure to pay, may result in removal of access after 90 days and deletion of the data after 120 days.
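The 90-day and 120-day limits in the last item above can be turned into calendar dates with a small sketch. It assumes that both periods are counted from the same starting date (the day non-compliance or non-payment begins), which the policy does not spell out; the function name `enforcement_dates` is hypothetical.

```python
from datetime import date, timedelta
from typing import Tuple

ACCESS_REMOVAL_DAYS = 90    # access may be removed after 90 days
DATA_DELETION_DAYS = 120    # data may be deleted after 120 days


def enforcement_dates(start: date) -> Tuple[date, date]:
    """Return (earliest access removal, earliest data deletion).

    ASSUMPTION: both periods run from the same start date; the policy
    does not say whether the 120 days begin at the start or at removal.
    """
    removal = start + timedelta(days=ACCESS_REMOVAL_DAYS)
    deletion = start + timedelta(days=DATA_DELETION_DAYS)
    return removal, deletion


if __name__ == "__main__":
    removal, deletion = enforcement_dates(date(2015, 6, 4))
    print("Earliest access removal:", removal.isoformat())
    print("Earliest data deletion: ", deletion.isoformat())
```

Because the policy says these actions "may" occur, the dates are the earliest possible ones under the stated assumption, not a schedule.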
Service Level Agreement
- The General Purpose Research Storage will not have the same level of service provided to ‘enterprise services’ such as email. Researchers will be supported during regular business hours except for emergency cases (security, stability, and data integrity issues). Support will be primarily through the IT Pro, backed up and supported by Research Support Computing staff.
- The storage is highly resilient to component failure; however, backup services are available at an additional cost and are strongly encouraged.
- Client OS access and support are limited to operating systems supported by the Division of IT. Data will be accessible via the NFS and SMB protocols on campus, and remotely over VPN, granted on a case-by-case basis. GPRS storage is available to the Lewis cluster via login and Data Transfer Nodes (DTN) but is not available on compute nodes.
- When services are limited, users will be notified via email and postings on the DoIT Research Computing website. The Division of IT cannot guarantee service levels under circumstances beyond its control, including the following:
- Natural or man-made disasters
- Electrical or network outages or disruptions
- Equipment failure unrelated to abuse or neglect
- Attacks on the network
The following metrics will be collected to support the general purpose research storage:
- Data storage used.
- Collaborations supported.
- Publications and other research outcomes.
- External exchange of data (data transfer metrics).
This policy will be periodically reviewed by DoIT’s Research Computing Support Services and MU’s Cyberinfrastructure Council, and will be updated and revised as necessary. Revisions will be communicated to users and posted to the DoIT Research Computing website.
Prepared by MU’s Cyberinfrastructure Council Committee on Data Management:
- Timothy Middelkoop, Division of IT
- Mike Watson, Arts & Science
- Ernest Shaw, MU Libraries
- Ann Riley, MU Libraries
- Nathan Bivens, DNA Core Facility
- Bob Schnabel, Animal Science