Using HPC for Mission-Critical Workloads


Supercomputers were once large, monolithic, data-crunching machines generally associated with organizations that were similarly large and monolithic, such as NASA, the U.S. Army or the Los Alamos National Laboratory. Today, however, organizations of almost any size can achieve supercomputer-level performance using off-the-shelf components and the latest system management tools.

Although the term “supercomputer” technically refers to an elite class of machines, high-performance computing (HPC) systems are capable of delivering the processing power required by compute-intensive applications. Historically designed using proprietary technologies, the latest HPC solutions use x86 chips in highly scalable architectures based upon the Linux operating system. So-called “hard partitioning” techniques ensure maximum reliability and performance for mission-critical applications.

A combination of commodity hardware and open-source operating systems is making high-performance computing more accessible to businesses of all sizes. No longer just for calculation-intensive tasks such as weather forecasting, seismic analysis and fluid mechanics, affordable HPC systems now support a wide range of workloads, including financial modeling, R&D and big data. Organizations are using HPC systems to replace aging proprietary platforms, reduce server sprawl and gain business benefits such as high availability and improved manageability.

Scalability is the Achilles’ heel of low-end servers. When the processing capacity of these servers is reached, more servers must be added, increasing management headaches. Costs can quickly mount when you consider per-server software licenses, backup and recovery solutions and other necessary add-ons. Scalability is the hallmark of the latest HPC systems, which enable organizations to add processing power and memory to handle applications such as data analytics.

Although the initial capital cost of an HPC server tends to be higher, the TCO benefits of a scale-up environment become evident over the lifecycle of the system. Fewer systems also mean less demand for resources such as power, cooling and network interconnects, further reducing costs and improving data center efficiency.

Improved reliability, availability and serviceability (RAS) has long been a benefit of scalable high-performance servers. Sophisticated failover capabilities help maintain data integrity and business continuity during planned and unplanned downtime.

Hard partitioning helps to boost reliability in HPC systems. It enables the system to be configured as one large server or several smaller ones, each with its own dedicated processor and memory resources and operating environment, effectively isolating mission-critical applications from failures and reboots in other partitions.

Electrically isolated hard partitions enable near bare-metal performance and stronger application and OS fault tolerance while increasing overall system availability. Yet you still have the flexibility benefits of virtualization. Although each hard partition acts like an individual server, it is integrated into the overall environment. You can scale the physical resources of a partition through software even while it’s running.
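To make the partitioning behavior described above concrete, here is a minimal sketch of the idea in Python. The class name and methods are hypothetical illustrations, not a vendor API: a fixed pool of CPUs and memory is carved into partitions with dedicated resources, and a running partition can be grown through software without touching its neighbors.

```python
# Hypothetical model of hard partitioning: dedicated, non-overlapping
# CPU and memory allocations that can be resized while partitions run.

class PartitionedSystem:
    def __init__(self, total_cpus, total_mem_gb):
        self.free_cpus = total_cpus
        self.free_mem_gb = total_mem_gb
        self.partitions = {}

    def create(self, name, cpus, mem_gb):
        """Dedicate CPUs and memory to a new hard partition."""
        if cpus > self.free_cpus or mem_gb > self.free_mem_gb:
            raise ValueError("insufficient free resources")
        self.free_cpus -= cpus
        self.free_mem_gb -= mem_gb
        self.partitions[name] = {"cpus": cpus, "mem_gb": mem_gb}

    def resize(self, name, extra_cpus=0, extra_mem_gb=0):
        """Grow a partition online; other partitions are untouched."""
        if extra_cpus > self.free_cpus or extra_mem_gb > self.free_mem_gb:
            raise ValueError("insufficient free resources")
        self.free_cpus -= extra_cpus
        self.free_mem_gb -= extra_mem_gb
        self.partitions[name]["cpus"] += extra_cpus
        self.partitions[name]["mem_gb"] += extra_mem_gb

# Example: one mission-critical partition, one analytics partition.
system = PartitionedSystem(total_cpus=64, total_mem_gb=1024)
system.create("erp", cpus=32, mem_gb=512)
system.create("analytics", cpus=16, mem_gb=256)
system.resize("analytics", extra_cpus=8)  # grown while "running"
print(system.free_cpus)  # 8 CPUs remain unallocated
```

The point of the sketch is the accounting discipline: because resources are dedicated at creation time and never shared, a fault or reboot in one partition cannot consume capacity belonging to another.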

The concept of supercomputing has changed over the years as servers based upon commodity components have become more powerful. The price/performance of HPC systems based upon x86 chips makes them a compelling value proposition for organizations of all sizes. HPC systems give organizations the scalability, reliability and performance to support mission-critical applications in a flexible, cost-effective package.