Apr 01 2022

Q&A: HPC Solutions Work for Agencies and Projects of All Sizes

Dell Technologies’ high-performance computing solutions serve customers who need assistance with Big Data jobs.

Dell Technologies has provided high-performance computing (HPC) solutions to customers since the late 1990s, working with large labs that run supercomputers as well as smaller organizations working on individual Big Data projects. Its HPC & AI Innovation Lab helps tailor HPC solutions to customers’ specific needs. Stephen Sofhauser, federal team lead for HPC at Dell Technologies, describes to FedTech magazine how HPC solutions can be valuable to customers and handle projects of any size.

FEDTECH: How did Dell enter the HPC solutions business?

Sofhauser: We can trace our roots in high-performance computing with industry-standard solutions to the late 1990s, with academic clusters at Cornell University and the University at Buffalo. Our business soon grew, with notable systems like Tungsten at the National Center for Supercomputing Applications in 2003 and Stampede at the Texas Advanced Computing Center (TACC) in 2013, and most recently with the announcement that Dell Technologies was selected for the Department of Energy Tri-Labs contract.

Though the large, cutting-edge clusters are exciting for pushing the boundaries of HPC, we are just as excited to see the results our customers get with the smaller clusters that make up a lot of our HPC installations. It always makes my day when I hear that a Dell cluster helped to develop a new treatment for childhood cancer, as happened at the Translational Genomics Research Institute, or was used to narrow down treatments for COVID-19, as happened at TACC during the pandemic.

FEDTECH: What services does Dell provide for HPC?

Sofhauser: Dell has a Dell-badged installation team that installs HPC systems in classified and nonclassified environments, taking a system from design to acceptance. Dell also offers managed services to administer the cluster for the customer, both on-premises and remotely. We can work with our channel partners to combine the best of both, giving customers the optimal experience for their HPC needs.

FEDTECH: Are HPC and supercomputers synonymous?

Sofhauser: Well, technically, HPC industry analysts have defined the supercomputer segment of the market as any solution that costs $500,000 or more. (There are differing opinions about that definition, but it’s one that works.) What is interesting to me is that once you get over $1 million, the systems at a high level are using the same technologies for interconnects, compute nodes and storage, just more of them. The challenges at the higher system level come from power, cooling and system management. With higher-wattage CPUs and GPUs, and the complexities larger clusters can bring, liquid cooling has become necessary for some customers, and we expect that need to grow in the future.

FEDTECH: Discuss some of the industry solutions that use HPC.

Sofhauser: For years, Dell has offered engineering-validated systems developed by our HPC & AI Innovation Lab. These HPC systems are designed for specific workloads in computer-aided engineering, healthcare/life sciences, artificial intelligence/machine learning/deep learning, data analytics and other areas. The validation includes testing workloads specific to each industry. The cluster is tuned with the right processor, memory, interconnect design and storage for the specific workload or application. Benchmarks, guidance and more are available on our HPC & AI Innovation Lab website for customers as they consider modifying or planning their next system. Our HPC solution architects use this information to customize systems for customers when their needs differ from the engineering-validated design.

FEDTECH: What are the primary uses of HPC — is it all big science and Big Data, or can it be applied to smaller projects as well?

Sofhauser: HPC is no longer for just physics researchers using large clusters to explore the wonders of the universe. For years, Dell has been democratizing HPC — we call it “HPC for the masses” — by trying to make HPC easier to adopt for agencies, organizations and departments.

We created the engineering-validated designs to make designing and purchasing easier. Dell developed installation and knowledge transfer services, along with ProSupport for HPC, a phone-based and online safety net for our HPC customers. It’s available when and where needed. We sell a lot of 4- to 32-node clusters to agencies for simulations and for use in economics, banking policy and other areas.

People who are not familiar with the intricacies of HPC might not realize that even if an HPC cluster has, for example, 400 nodes, few jobs will use all 400 nodes. A lot of jobs need only a few nodes to run, so there are normally many jobs of all sizes running on an actively used HPC cluster.
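To make the point about job sizes concrete, here is a minimal Python sketch of how a 400-node cluster might hold several jobs at once. It is an illustration only: the job names and node counts are invented, and the simple first-fit logic is an assumption for the example, not a description of Dell software or of any real workload scheduler.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str   # hypothetical job label, for illustration only
    nodes: int  # number of nodes the job requests


def pack_jobs(total_nodes, queue):
    """Start queued jobs in order; a job that does not fit in the
    remaining free nodes is left waiting (simple first-fit)."""
    running, waiting = [], []
    free = total_nodes
    for job in queue:
        if job.nodes <= free:
            running.append(job)
            free -= job.nodes
        else:
            waiting.append(job)
    return running, waiting, free


# Invented workload mix on a 400-node cluster.
queue = [
    Job("climate-sim", 128),
    Job("genomics", 64),
    Job("econ-model", 8),
    Job("ml-training", 32),
    Job("full-machine", 400),  # the rare job that wants every node
    Job("cfd", 96),
    Job("post-processing", 4),
]

running, waiting, free = pack_jobs(400, queue)
print("running:", [(j.name, j.nodes) for j in running])
print("waiting:", [j.name for j in waiting])
print("free nodes:", free)
```

In this made-up mix, six jobs of different sizes share the cluster while the one full-machine request waits for the whole system to free up.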

FEDTECH: What does the HPC & AI Innovation Lab do?

Sofhauser: The HPC & AI Innovation Lab characterizes and optimizes HPC systems for various workloads. They study and test new hardware and software technologies in order to help our HPC solution architects and ultimately our customers design the best systems for their needs. The lab engineers share their knowledge through blogs and white papers and present at seminars.

I met a customer for the first time in New York City, just before COVID-19 hit, and I mentioned the lab to him. He smiled and said he had been reading white papers posted by the lab since 2008, when he worked overseas. The lab seems to have a loyal fan base that is growing.

FEDTECH: Do HPC solutions provide lessons or examples that can be used by agencies without supercomputers or HPC environments of their own?

Sofhauser: Yes. If you are running jobs on notebooks or workstations and they are taking days or weeks to complete, then you should investigate expanding HPC capabilities to provide more timely results. Even a small cluster, or time on a larger HPC cluster, can make all the difference. Using HPC allows you to ask deeper questions, consider more variables and get your answers faster in order to better support your stakeholders.
