Aug 03 2021
US nuclear weapon stockpile health checker Los Alamos Labs is looking to see if computational storage can speed up simulation modelling runs on its supercomputers.
Los Alamos Labs also works in the energy, environment, infrastructure, health, and global security areas, doing strategic work for the US government. It operates in the ultrascale high-performance computing environment and has set up an Efficient Mission-Centric Computing Consortium (EMC3) to develop more efficient computing architectures, system components, and environments for its mix of workloads. Computational storage developer NGD Systems is an EMC3 member, along with nearly 20 other organisations.
The oft-quoted Gary Grider, HPC division leader at Los Alamos, provided a statement about this: “Los Alamos is happy to see the evolution of computational offloads towards standards-based computational storage technology, and is hopeful explorations into use cases for this technology will bear fruit for the HPC and at-scale computing environments.”
NGD CTO Vladimir Alves explained Los Alamos’s interest: “NGD’s … computational storage platform makes it easy to try new concepts for offloading functions to near storage.”
What role does near storage play here? Brad Settlemyer, a senior scientist in Los Alamos’s HPC Design Group, explained: “Computational storage devices become a key source of acceleration when we are able to directly interpret the data within the storage device. With that component in place, near-storage analytics unleashes massive speedups via in-device reduction and fewer roundtrips between the device and host processors.”
NGD 12TB Newport ruler drive.
NGD builds its Newport line of NVMe flash storage drives, with up to 64TB of capacity, using on-board ASICs and Arm cores to process data stored in the drive. This means that Los Alamos HPC processors could have some of their work (repetitive processing of stored data, such as transcoding) offloaded to computational storage drives, freeing more compute cores to work on the overall problem. It’s all about accelerating HPC application runs.
We are told computational offloads using both in-network processing and near-storage compute are becoming an important part of both scale-up and scale-out computing, and that future scaling demands will all but require programmable elements along the data path to meet performance efficiency goals.
Alves said: “By offering an OS-based storage device, with on-board applications processors in our NVMe SSD ASIC solution, we offer partners like Los Alamos the ability to try many different paths to a more complete solution with a simple and seamless programming and device management model.”
As a by-product of the EMC3 work, Los Alamos has partnered with NGD Systems to build a curriculum for a set of summer internship programs in using NGD’s Newport Computational Storage Drives to accelerate data analytics.
You can’t view NGD Systems Newport drives as drop-in components in a computing data path. They have to be programmed — and that means an application’s code or a host server OS has to recognise and manage where processing is being carried out. That said, the potential benefits of having, say, a hundred Newport drives pre-process hundreds of terabytes of data before it is sent to expensive and dedicated HPC system cores for further processing can be huge. This would be particularly so with repetitive HPC runs, where the drive-level processors load pre-written code.
This would be equivalent to loading factory production-line robot welding machines with new programs to build car bodies instead of having a single central computer look after all the robots. The robots do the grunt work themselves, offloading the central system.
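To make the offload idea concrete, here is a toy Python simulation of the hundred-drive scenario. It is only a sketch under stated assumptions: `SimulatedDrive`, `read_all`, and `run_on_device` are hypothetical names invented for illustration and are not NGD's programming interface, and the "drives" are just in-memory lists. It compares how many values reach the host when the host does the summing with how many reach it when each drive reduces its own data first.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class SimulatedDrive:
    """Hypothetical stand-in for one computational storage drive."""
    records: List[int]

    def read_all(self) -> List[int]:
        # Conventional path: every record is shipped to the host.
        return list(self.records)

    def run_on_device(self, reduce_fn: Callable[[Iterable[int]], int]) -> int:
        # Offload path: the reduction runs next to the data, so only
        # one value per drive is returned to the host.
        return reduce_fn(self.records)


def host_side_sum(drives: List[SimulatedDrive]) -> Tuple[int, int]:
    """Sum on the host; returns (total, number of values moved to host)."""
    total, moved = 0, 0
    for drive in drives:
        data = drive.read_all()
        moved += len(data)
        total += sum(data)
    return total, moved


def near_storage_sum(drives: List[SimulatedDrive]) -> Tuple[int, int]:
    """Sum on each drive, then combine the partial results on the host."""
    partials = [drive.run_on_device(sum) for drive in drives]
    return sum(partials), len(partials)


# One hundred simulated drives, each holding 1,000 integer records.
drives = [SimulatedDrive(list(range(1000))) for _ in range(100)]

host_total, host_moved = host_side_sum(drives)
dev_total, dev_moved = near_storage_sum(drives)

print(f"host-side:    total={host_total}, values moved={host_moved}")
print(f"near-storage: total={dev_total}, values moved={dev_moved}")
```

Both paths produce the same answer, but in this toy model the host receives one value per drive rather than every record. Real devices add costs the sketch ignores, such as loading code onto the drive and the slower on-drive cores.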