Meeting national security science challenges with reliable computing
As part of the National Strategic Computing Initiative (NSCI), the Exascale Computing Project (ECP) was established to develop a capable exascale ecosystem, encompassing applications, system software, hardware technologies and architectures, and workforce development, to meet the scientific and national security mission needs of the U.S. Department of Energy (DOE) in the mid-2020s time frame.
The goal of ECP is to deliver breakthrough modeling and simulation solutions that analyze more data in less time, providing insights and answers to the most critical U.S. challenges in scientific discovery, energy assurance, economic competitiveness and national security.
What is exascale computing?
Exascale computing refers to computing systems capable of at least one exaflop, or a billion billion (10¹⁸) calculations per second. That is 50 times faster than the most powerful supercomputers in use today and represents a thousand-fold increase over the first petascale computer, which came into operation in 2008. How we use these large-scale simulation resources is the key to solving some of today’s most pressing problems, including clean energy production, nuclear reactor lifetime extension and nuclear stockpile aging.
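The arithmetic behind those ratios is simple enough to check directly. The short Python sketch below reproduces them; the 50x figure assumes a roughly 20-petaflop machine as today's most powerful system, consistent with the article's framing.

```python
# Back-of-the-envelope check of the scales quoted above. The 50x figure
# assumes today's most powerful machine runs at ~20 petaflops (an
# assumption for illustration, not a figure from the article).
petaflop = 1e15          # first petascale system (2008): ~1 petaflop
todays_fastest = 20e15   # assumed ~20 petaflops
exaflop = 1e18           # exascale target

print(f"exascale vs. first petascale: {exaflop / petaflop:,.0f}x")       # 1,000x
print(f"exascale vs. today's fastest: {exaflop / todays_fastest:.0f}x")  # 50x
```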
The Los Alamos role
In the run-up to developing exascale systems, at Los Alamos we will be taking the lead on a co-design center, the Co-Design Center for Particle-Based Methods: From Quantum to Classical, Molecular to Cosmological. The ultimate goal is the creation of scalable open exascale software platforms suitable for use by a variety of particle-based simulations.
Los Alamos is leading the Exascale Atomistic capability for Accuracy, Length and Time (EXAALT) application development project. EXAALT will develop a molecular dynamics simulation platform that fully utilizes the power of exascale. The platform will allow users to choose the point in the accuracy-length-time space that is most appropriate for the problem at hand, trading the cost of one against another. The EXAALT project will be powerful enough to address a wide range of materials problems. For example, during its development, EXAALT will examine the degradation of UO₂ fission fuel and plasma damage in tungsten under fusion first-wall conditions.
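As a heavily simplified illustration of that trade space, the Python sketch below implements one classical molecular dynamics setup: particle count stands in for length scale, timestep and step count for time scale, and the choice of interatomic potential for accuracy. It is a toy example, not EXAALT's actual methods.

```python
# A minimal classical MD sketch: velocity-Verlet integration with
# Lennard-Jones forces. The knobs (N particles, dt, n_steps, the force
# model itself) are the same quantities EXAALT-style platforms trade
# against one another. Illustrative only; not EXAALT code.
import numpy as np

def lj_forces(pos, eps=1.0, sigma=1.0):
    """Pairwise Lennard-Jones forces (no cutoff, O(N^2) -- fine for a demo)."""
    disp = pos[:, None, :] - pos[None, :, :]      # (N, N, 3) displacements
    r2 = np.sum(disp**2, axis=-1)
    np.fill_diagonal(r2, np.inf)                  # ignore self-interaction
    inv_r6 = (sigma**2 / r2) ** 3                 # (sigma/r)^6
    fmag = 24.0 * eps * (2.0 * inv_r6**2 - inv_r6) / r2
    return np.sum(fmag[:, :, None] * disp, axis=1)

def velocity_verlet(pos, vel, dt, n_steps):
    f = lj_forces(pos)
    for _ in range(n_steps):
        vel += 0.5 * dt * f                       # half kick
        pos += dt * vel                           # drift
        f = lj_forces(pos)
        vel += 0.5 * dt * f                       # half kick
    return pos, vel

g = np.linspace(0.8, 4.0, 4)                      # 4x4x4 lattice of atoms
pos = np.array(np.meshgrid(g, g, g)).reshape(3, -1).T.copy()  # 64 particles ("length")
vel = np.zeros_like(pos)
pos, vel = velocity_verlet(pos, vel, dt=1e-3, n_steps=100)    # 100 steps ("time")
```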
In addition, Los Alamos and partnering organizations will be involved in key software development proposals that cover many components of the software stack for exascale systems, including programming models and runtime libraries, mathematical libraries and frameworks, tools, lower-level system software, data management and I/O, as well as in situ visualization and data analysis.
Building a capable exascale ecosystem
The NSCI framework defines computing for the next several decades and includes:
- Pursuing specific exascale computing platforms
- Using exascale and large-scale simulation well
- Imagining what comes next
A collaboration of partners
ECP is a collaborative effort of two DOE organizations—the Office of Science and the National Nuclear Security Administration (NNSA). DOE formalized this long-term strategic effort under the guidance of key leaders from six DOE and NNSA National Laboratories: Argonne, Lawrence Berkeley, Lawrence Livermore, Los Alamos, Oak Ridge and Sandia. The ECP leads the formalized project management and integration processes that bridge and align the resources of the DOE and NNSA laboratories, allowing them to work with industry more effectively.
Led by Galen Shipman of the Computer, Computational, and Statistical Sciences Division and Professor Alex Aiken of Stanford University
Usable exascale systems will require significant advances in the programming environment to effectively manage billion-way concurrency and optimize data movement in a complex memory and storage hierarchy, while providing performance portability across different system architectures. The Legion programming system is well positioned to address these challenges. In this project, we will build upon the Legion open-source programming system, delivering new features required by exascale applications, integrating with other elements of the exascale software stack, and co-designing the system with hardware vendors.
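The core idea in task-based models such as Legion is that computation is decomposed into tasks with explicit data requirements, so a runtime can schedule the tasks and manage data movement. The sketch below illustrates only that decomposition pattern, using Python standard-library futures; it is not Legion's API (Legion itself is a C++ runtime system, with the Regent language layered on top).

```python
# Task/region decomposition in miniature: a 3-point stencil over data split
# into two "logical regions," with the halo dependency between tasks made
# explicit -- analogous in spirit to region requirements in Legion.
from concurrent.futures import ThreadPoolExecutor

def stencil_task(region, halo_left, halo_right):
    """A task that reads its region plus halo cells and writes a new region."""
    padded = [halo_left] + region + [halo_right]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]

data = list(range(16))
parts = [data[0:8], data[8:16]]          # two partitions ("logical regions")

with ThreadPoolExecutor() as pool:
    # The "runtime" launches one task per partition; halo values encode
    # each task's data dependencies on its neighbor's region.
    f0 = pool.submit(stencil_task, parts[0], parts[0][0], parts[1][0])
    f1 = pool.submit(stencil_task, parts[1], parts[0][-1], parts[1][-1])
    result = f0.result() + f1.result()
```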
Led by James Ahrens of the Computer, Computational, and Statistical Sciences Division
This project will deliver algorithms and infrastructure suitable for the visualization and analysis needs of exascale applications. Many high-performance simulation codes currently use post hoc processing, meaning they write data to storage and then analyze it afterwards. Given exascale storage constraints, in situ processing will be necessary: in situ data visualization and analysis selects, reduces, and generates extracts from scientific results during simulation runs to overcome bandwidth and storage bottlenecks. Our capability will leverage our existing, successful open-source visualization software packages, ParaView and VisIt. Lawrence Berkeley National Laboratory, Lawrence Livermore National Laboratory, and Kitware, Inc. will participate in the project.
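The pattern is easy to see in miniature: rather than dumping full state for post hoc analysis, the analysis runs inside the simulation loop and only a small extract survives. The Python sketch below is a hypothetical stand-in, not ParaView or VisIt code (those packages offer real in situ interfaces, Catalyst and Libsim respectively).

```python
# In situ reduction: keep a ~20-number histogram per extract instead of
# the 256*256 = 65,536 field values a post hoc dump would write each time.
import numpy as np

rng = np.random.default_rng(1)
field = rng.normal(size=(256, 256))              # stand-in simulation field

extracts = []
for step in range(100):
    field += 0.01 * rng.normal(size=field.shape) # stand-in time step
    if step % 10 == 0:
        hist, edges = np.histogram(field, bins=20)
        extracts.append((step, hist))            # only the extract is kept

print(f"kept {len(extracts)} extracts of {extracts[0][1].size} bins each")
```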
Led by Michael Lang of the Computer, Computational, and Statistical Sciences Division
As new technology is adopted by high-performance computing (HPC) hardware developers to address the challenges of exascale, much of the focus (and, as a result, much of the innovation) has landed on memory subsystems. Many new devices are available or emerging in the near term that create a diverse and complicated programming environment for developers who wish to use this technology. The goal of this ECP project is to provide a common, simplified runtime and application interface to these many devices. This work will have broad applicability not only to exascale and HPC applications but more generally to Linux-based software development. This project is a collaboration between Los Alamos National Laboratory, Oak Ridge National Laboratory, Lawrence Livermore National Laboratory, Sandia National Laboratories and Georgia Tech.
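A minimal sketch of what a common, simplified interface over diverse memory devices might look like: the application requests memory by abstract kind, and the runtime maps the request onto whatever devices the node actually has. Every name below (the kinds, the fallback policy, the allocate function) is invented for illustration and is not this project's API.

```python
# Hypothetical memory-kind abstraction: callers ask for FAST, LARGE, or
# PERSISTENT memory; the runtime degrades gracefully when a device is absent.
from enum import Enum, auto

class MemKind(Enum):
    FAST = auto()        # e.g., on-package high-bandwidth memory
    LARGE = auto()       # e.g., conventional DDR
    PERSISTENT = auto()  # e.g., non-volatile memory

# What this (imaginary) node actually provides, and how to fall back.
AVAILABLE = {MemKind.LARGE, MemKind.PERSISTENT}
FALLBACK = {MemKind.FAST: MemKind.LARGE, MemKind.PERSISTENT: MemKind.LARGE}

def allocate(nbytes: int, kind: MemKind) -> tuple[bytearray, MemKind]:
    """Allocate nbytes from the requested kind, falling back if absent."""
    while kind not in AVAILABLE:
        kind = FALLBACK[kind]                 # degrade to the next-best kind
    return bytearray(nbytes), kind            # stand-in for a device pointer

buf, where = allocate(1 << 20, MemKind.FAST)  # app asks for fast memory...
print(f"1 MiB placed in {where.name}")        # ...runtime placed it in LARGE
```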