Research Computing Summer 2017 Update

Research Computing Support Services continues to grow. In August, researchers ran 2,932,217 core-hours of computation on the Lewis cluster, up from 592,358 core-hours a year earlier, nearly a fivefold increase.

In April we added five new nodes and integrated the remainder of the existing NSF MRI Research Cluster grant (NSF Award 1429294) into the Lewis cluster. This grant is ending after three years, and its resources (around 70 nodes) are now part of the Lewis cluster. The grant was instrumental in developing the Lewis environment as it is today, and we are grateful to the NSF and to the researchers in engineering who received the award.

The cluster now has over 5000 modern cores, with an additional 400 older cores for interactive and debugging use. There are also now 10 GPU nodes (NVIDIA K20m or K40) from the MRI RC grant, and we have added five more GPUs (one NVIDIA P100 and four NVIDIA GTX 1080 Ti) to the cluster.

HPC investment in the cluster continues to grow, and we added 8 more investor nodes at the end of the summer. A number of groups have now invested multiple times: they were able to get started with a small investment and add capacity when they needed it. Given the agility of our environment, we have been able to make smaller purchases for them in between larger investments. This is made possible by a special relationship with Dell that gives our HPC investors continuous pricing, allowing us to grow when and how we need to. It has the added benefit that, by taking care of smaller immediate needs, we are now able to coordinate larger purchases. This fall we will combine an increase to the community pool (available to all researchers at MU/UM) with a number of larger investments for even deeper discounts.

Currently the cluster capacity is made up of 20% from individual investors, 13% from an investment by the Office of Research and Mizzou Advantage specifically for bioinformatics computations (the BioCompute partition), 42% from an Engineering MRI grant [1] (NSF Award 1429294), and 25% from the Division of IT for the campus research computing community. In addition, we have about 8% additional capacity (400 older cores) for interactive jobs, which is not included in the preceding percentages.
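The breakdown above can be turned into approximate core counts with a few lines of arithmetic. This is only a rough sketch: it assumes a pool of exactly 5000 modern cores (the cluster has "over 5000"), and the dictionary keys and the `approx_cores` helper are illustrative names, not part of any RCSS tooling.

```python
# Capacity shares quoted above, as a percentage of the modern-core pool.
shares = {
    "individual investors": 20,
    "BioCompute partition": 13,
    "Engineering MRI grant": 42,
    "Division of IT community pool": 25,
}

MODERN_CORES = 5000  # assumption: the text only says "over 5000"
OLDER_CORES = 400    # older cores kept for interactive jobs


def approx_cores(share_percent, total=MODERN_CORES):
    """Approximate number of cores behind a percentage share."""
    return total * share_percent // 100


total_share = sum(shares.values())            # the four shares cover the whole pool
older_pct = 100 * OLDER_CORES / MODERN_CORES  # older cores relative to the modern pool

for name, pct in shares.items():
    print(f"{name}: {pct}% ~ {approx_cores(pct)} cores")
print(f"older interactive cores: about {older_pct:.0f}% extra")
```

Under these assumptions the four shares account for the full modern-core pool, and the 400 older cores come to about 8% on top of it, matching the figures in the text.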

A recent example of a researcher who was immediately able to use their investment demonstrates the power of investing. The researcher invested in 5 nodes (140 cores) and, after migrating their workload onto the cluster, ran 137,000 core-hours in the first 41 days on up to 512 cores against their 140 invested cores, utilizing 125% of their investment during this period.

On July 1st we finished transitioning Research Network (RNet) connections in the building from old, slow equipment (some of it over 10 years old) to the campus infrastructure. This allows us to provide RNet to all researchers across campus. In addition, we built a new General Purpose Research Network (GPRN) on campus: a special zone that allows high-speed connectivity to and from research computing resources (Lewis, for example) while providing greater protection from internet attacks. Research Network service is available on any network port across campus and is charged the standard port fee. Experimental, dedicated, or high-bandwidth capabilities on the Research Network are still available through partnerships with researchers; reach out to our team if you are interested.

Improvement to the Research Network continues with the transition of our 100Gbps Internet2 AL2S connection, which was part of an NSF CC*NIE grant [2] (NSF Award 1245795), to a dedicated connection to the Great Plains Network (GPN) in Kansas City. This connection provides 100Gbps Layer 2 connectivity to other GPN institutions and to other Internet2 AL2S sites. We are working on bringing IPv6 Internet2 connectivity (Layer 3) to this circuit as well. We are also upgrading our core cluster switch to 100Gbps which, in conjunction with the Layer 3 upgrades, will allow the cluster to connect to other universities (Internet2 IPv6 sites) at 100Gbps, with a theoretical transfer rate of around 10GB/s.
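As a rough check on the transfer rate quoted above, a link speed in bits per second can be converted to bytes per second. The sketch below is a back-of-the-envelope calculation, not a measurement. The function names are illustrative, and the 80% efficiency factor is an assumption chosen only because 80% of the raw rate lines up with the "around 10GB/s" estimate; real throughput depends on protocol overhead, storage, and end-host tuning.

```python
def line_rate_gbytes_per_s(gbps):
    """Raw byte rate in GB/s for a link speed given in Gbps (8 bits per byte)."""
    return gbps / 8


def effective_rate_gbytes_per_s(gbps, efficiency):
    """Byte rate after applying an assumed end-to-end efficiency (0 to 1)."""
    return line_rate_gbytes_per_s(gbps) * efficiency


def transfer_seconds(size_gbytes, rate_gbytes_per_s):
    """Time in seconds to move size_gbytes at a sustained rate."""
    return size_gbytes / rate_gbytes_per_s


raw = line_rate_gbytes_per_s(100)                  # raw rate of a 100Gbps link
effective = effective_rate_gbytes_per_s(100, 0.8)  # assumed 80% efficiency
one_tb = transfer_seconds(1000, effective)         # 1 TB taken as 1000 GB

print(f"raw: {raw} GB/s, effective: {effective:.1f} GB/s, 1 TB in {one_tb:.0f} s")
```

With these assumptions, a 100Gbps link carries 12.5GB/s at the raw line rate and about 10GB/s after overhead, so a 1 TB dataset would move in roughly 100 seconds.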

The recently formed ShowMeCI.org group for “Sharing Cyberinfrastructure information, education, and resources across the Show Me State” has agreed to form a State Research Platform modeled after the Pacific and National Research Platforms (PRP and NRP) (http://prp.ucsd.edu/). The goal of this effort is to enable researcher-to-researcher connectivity that spans the state and the region, through a network of people equipped with standardized instrumentation and data transfer tools so that problems can be diagnosed and fixed. This effort is driven by researchers working across institutions with large data transfer needs, so if you have this need, please contact us.

Finally, on the horizon is the replacement of the Lewis cluster storage with a high-speed parallel filesystem (zLustre), supported by a 100Gbps core switch upgrade. This will be followed by another expansion of the Lewis cluster through HPC investment and nodes supported by RCSS.

For more information on the future of Cyberinfrastructure at the University of Missouri please attend CI-Day 2017 (http://doit.missouri.edu/ci).

## References

  1. http://munews.missouri.edu/news-releases/2014/0926-nsf-grants-1-million-to-mu-to-expand-supercomputer-equipment-and-expertise-for-big-data-analytics-at-mu/
  2. http://engineering.missouri.edu/2013/01/mu-researcher-secures-nsf-grant/

Sep. 27, 2017