Workshops - Wednesday, October 13
Delta Force Exascale: Runtime and Tools Requirements for the Programming Models of the Future
Organizing Committee:
Richard Barrett, Sandia National Laboratories
Ben Bergen, Los Alamos National Laboratory
So far, most exascale discussions of programming models have focused on how to make exascale computing more accessible to domain scientists. However, when these systems first come online, the user group will be limited to a fairly small number of elite application developers working on very agile codes. These efforts are of interest to the larger HPC community because they will help establish the tools and techniques used to tame the exascale landscape as computing at this scale becomes more common. This workshop will focus on defining the basic runtime and tools requirements for these first-wave invaders, as well as the demarcation zone between system-level and application responsibilities for handling issues such as fault tolerance, scheduling, and communication.
7:30 – 8:30 | Breakfast
8:30 – 9:05 | Welcome and Introduction, Ben Bergen
9:05 – 9:40 | “Programming Models,” Mike Houston
9:40 – 10:00 | Discussion
10:00 – 10:30 | Coffee Break
10:30 – 11:05 | “Hardware Trends,” Peter Hofstee
11:05 – 11:40 | “Hardware for Future Software Needs,” Zach Baker
11:40 – 12:00 | Discussion
12:00 – 1:30 | Lunch Break (on your own)
1:30 – 2:05 | “Runtime Systems,” Ron Brightwell
2:05 – 2:40 | “Runtime Systems,” Jean-Marie Verdune
2:40 – 3:00 | Discussion
3:00 – 3:30 | Coffee Break
3:30 – 4:05 | “Exascale Tools,” Sameer Shende
4:05 – 4:40 | “Exascale Tools,” David Montoya
4:40 – 5:10 | Discussion and Recap
Exascale Co-Design for Materials in Extremes
Organizing Committee:
Tim Germann, Los Alamos National Laboratory
Jim Belak, Lawrence Livermore National Laboratory
Sriram Swaminarayan, Los Alamos National Laboratory
Scott Futral, Lawrence Livermore National Laboratory
Exascale computing presents an enormous opportunity for solving some of today’s most pressing problems, including producing clean energy, extending nuclear reactor lifetimes, and understanding nuclear stockpile aging. At its core, each of these problems requires predicting material response to extreme environments. The purpose of this workshop is to discuss the role of co-design in establishing the inter-relationship between software and hardware required for materials simulation at the exascale. In particular, we will discuss the research components needed to create a multiphysics exascale simulation framework for modeling materials subjected to extreme mechanical and radiation environments. The ultimate goal is to develop a UQ-driven adaptive physics refinement method in which coarse-scale simulations spawn sub-scale direct numerical simulations as needed. This task-based approach leverages the extensive concurrency and heterogeneity expected at exascale while enabling fault tolerance within applications. A key step in the co-design process is the creation of benchmark codes that stress all aspects of the exascale design. This half-day workshop will bring together participants with the expertise in computer science, applied math, and computational materials science required to achieve this goal.
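As a concrete sketch of the adaptive refinement idea described above, the Python fragment below (our illustration, not code from the co-design center; coarse_step, fine_scale_model, uq_error, and the placeholder physics are all invented) shows a coarse-scale step that spawns sub-scale tasks only where a UQ error estimate exceeds a threshold:

# Hypothetical sketch of UQ-driven adaptive physics refinement: a coarse-scale
# solver spawns fine-scale tasks only where the uncertainty estimate is too high.
from concurrent.futures import ProcessPoolExecutor

UQ_THRESHOLD = 0.05  # spawn a sub-scale direct numerical simulation above this

def fine_scale_model(cell_state):
    """Stand-in for an expensive sub-scale direct numerical simulation."""
    return cell_state * 0.99  # placeholder physics

def uq_error(cell_state):
    """Stand-in for an uncertainty-quantification error estimate."""
    return abs(cell_state) % 0.1

def coarse_step(cells, pool):
    # Submit fine-scale tasks only for cells whose UQ estimate is too coarse;
    # the task pool absorbs the resulting irregular concurrency.
    futures = {i: pool.submit(fine_scale_model, c)
               for i, c in enumerate(cells) if uq_error(c) > UQ_THRESHOLD}
    return [futures[i].result() if i in futures else c
            for i, c in enumerate(cells)]

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:
        print(coarse_step([0.01, 0.42, 0.73, 0.06], pool))

Because each sub-scale task is independent, a failed task can simply be resubmitted, which is one way a task-based design can enable fault tolerance within the application.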
Topics may include:
- Computational co-design
- Current and future programming models
- Domain-specific languages
- Scale-bridging techniques
- Uncertainty quantification methodologies and concepts
- Scalable tools, including visualization
- Performance modeling and simulation
- Vendor interaction
7:30 – 8:30 | Breakfast
8:30 | Welcome & Introduction
8:35 | “Overview and vision for the Exascale Co-Design Center,” Tim Germann (LANL)
8:50 | “Using Domain-Specific Languages to Enable Innovative Hardware and Software,” Pat Hanrahan (Stanford)
9:20 | “CoOperative Parallelism programming model,” David Jefferson (LLNL)
9:40 | “Novel Algorithms in Computational Materials Science: Enabling Adaptive Sampling,” Nathan Barton (LLNL)
10:00 | Coffee Break
10:30 | “Related task parallelism programming model,” Paul Henning (LANL)
10:50 | “Structural Simulation Toolkit (SST),” Jim Ang/Arun Rodrigues (SNL)
11:10 | “Performance modeling and analysis,” Philip Roth (ORNL)
11:20 | “Emerging Architectures,” Kyle Spafford (ORNL)
11:30 | “Exascale Data Analysis and Visualization for the Multi-scale Materials Co-design Center,” Jim Ahrens (LANL)
12:00 – 1:30 | Lunch Break (on your own)
1:30 – 3:00 | Exascale Co-design Center organizational meeting (invitation only)
3:00 | Coffee Break
3:30 – 6:00 | Exascale Co-design Center organizational meeting (invitation only)
Hardware Trends and SW/HW Co-Tuning Opportunities
http://lph.ece.utexas.edu/merez/LACSS2010/HWSWWorkshop
Organizing Committee:
Mattan Erez, University of Texas at Austin
Reaching exascale will involve significant changes to the underlying system components, such as the processor, memory, and interconnect. These changes include new technology as well as continued advances in, and improved efficiency of, known techniques. For example, emerging non-volatile memory can potentially be integrated into the main memory system rather than used only for solid-state disks, and integrated optical interconnects can significantly change communication tradeoffs. At the same time, new opportunities are emerging for improving the efficiency of the processor architecture itself and of both on-chip and off-chip electrical links. In this workshop we will focus on trends in hardware components, projections of future capabilities and constraints, and the implications for applications. We will also discuss opportunities for co-tuning software and hardware and the potential for new paradigms. The goals of the workshop are to present predictions of where hardware is heading and to identify the potential problems and opportunities this new technology creates for applications and software.
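As a toy illustration of what co-tuning means in practice (ours, not the workshop's), the sketch below jointly searches invented hardware and software parameter spaces against a made-up cost model; a real co-tuning flow would replace the analytic model with simulator runs or hardware measurements:

# Purely illustrative software/hardware co-tuning: rather than tuning code for
# a fixed machine, search the joint hardware/software parameter space.
from itertools import product

HW_SPACE = {"dram_channels": (2, 4, 8), "link_gbps": (25, 50, 100)}
SW_SPACE = {"tile": (16, 32, 64)}

def cost_model(dram_channels, link_gbps, tile):
    """Toy energy-delay estimate; every coefficient here is invented."""
    time = 1.0 / (dram_channels * tile) + 10.0 / link_gbps
    energy = 0.5 * dram_channels + 0.02 * link_gbps + 0.001 * tile
    return time * energy

best = min(product(HW_SPACE["dram_channels"], HW_SPACE["link_gbps"],
                   SW_SPACE["tile"]),
           key=lambda p: cost_model(*p))
print("best (dram_channels, link_gbps, tile):", best)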
7:30 – 8:30 | Breakfast
8:15 – 8:30 | Welcome and Introduction, Mattan Erez
8:30 – 9:05 | Single-Chip Heterogeneous Computing: Does the Future Include Custom Logic, FPGAs, and GPGPUs?, James C. Hoe
9:05 – 9:40 | From GPU Computing to Exascale: Technology Trends, Brucek Khailany
9:40 – 10:00 | Minipanel: Processor Trends
10:00 – 10:30 | Coffee Break
10:30 – 11:05 | Sustainable Silicon: Energy-Efficient VLSI Interconnects, Patrick Chiang
11:05 – 11:40 | Optical Interconnects for Exascale Systems, Moray McLaren
11:40 – 12:00 | Minipanel: Interconnect Trends
12:00 – 1:30 | Lunch Break (on your own)
1:30 – 2:05 | Low-power/Low-voltage Computing, Shih-Lien Lu
2:05 – 2:40 | Processors have evolved, why haven't main memories?, Al Davis
2:40 – 3:00 | Minipanel: On- and Off-Chip Memories
3:00 – 3:30 | Coffee Break
3:30 – 4:00 | Quick Recap, Mattan Erez
Resilience Summit 2010
http://www.csm.ornl.gov/srt/conferences/ResilienceSummit/2010/
Workshop general co-chairs:
Stephen L. Scott
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Chokchai (Box) Leangsuksun
eXtreme Computing Research Group
Louisiana Tech University
Program co-chairs:
Christian Engelmann
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Program committee:
Sean Blanchard, Los Alamos National Laboratory
Jim Brandt, Sandia National Laboratories, USA
Greg Bronevetsky, Lawrence Livermore National Laboratory
Franck Cappello, UIUC-INRIA Joint Laboratory on PetaScale Computing
Nathan DeBardeleben, Advanced Computing Systems Program, DoD
Ann Gentile, Sandia National Laboratories
Recent trends in high-performance computing (HPC) systems have clearly indicated that future increases in performance, beyond those resulting from improvements in single-processor performance, will be achieved through corresponding increases in system scale, i.e., through a significantly larger component count. As the raw computational performance of the world's fastest HPC systems grows from today's petascale to next-generation exascale capability and beyond, the number of computational, networking, and storage components will rise from the ten to one hundred thousand compute nodes of today's systems to several hundred thousand compute nodes and more in the foreseeable future. This substantial growth in system scale, and the resulting component count, poses a challenge for HPC system and application software with respect to fault tolerance and resilience.
Furthermore, recent experiences on extreme-scale HPC systems with non-recoverable soft errors, i.e., bit flips in memory, cache, registers, and logic, have added another major source of concern. The probability of such errors grows not only with system size but also with increasing architectural vulnerability caused by employing accelerators, such as FPGAs and GPUs, and by shrinking nanometer technology. Reactive fault tolerance technologies, such as checkpoint/restart, are unable to handle high failure rates due to their associated overheads, while proactive resilience technologies, such as preemptive migration, simply fail because random soft errors cannot be predicted. Moreover, soft errors may even remain undetected, resulting in silent data corruption.
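A back-of-the-envelope model shows why checkpoint/restart overhead becomes untenable at scale. The Python sketch below (our illustration, not part of the workshop program; the per-node MTBF and checkpoint cost are assumed values) uses Young's first-order approximation for the optimal checkpoint interval, t_opt = sqrt(2 * C * M), where C is the checkpoint cost and M is the system mean time between failures:

# Back-of-the-envelope model of checkpoint/restart overhead at scale,
# using Young's approximation for the optimal checkpoint interval.
import math

def checkpoint_overhead(nodes, node_mtbf_h=5.0e4, ckpt_cost_h=0.25):
    """Estimate the fraction of machine time lost to checkpointing.

    nodes       -- component count; system MTBF is node_mtbf_h / nodes
    node_mtbf_h -- assumed per-node mean time between failures (hours)
    ckpt_cost_h -- assumed time to write one checkpoint (hours)
    """
    system_mtbf = node_mtbf_h / nodes
    interval = math.sqrt(2.0 * ckpt_cost_h * system_mtbf)  # Young's formula
    # Overhead = checkpoint time per interval + expected lost work per failure.
    return ckpt_cost_h / interval + interval / (2.0 * system_mtbf)

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} nodes: {checkpoint_overhead(n):6.1%} overhead")

With these assumed numbers, the overhead fraction climbs from about 10% at one thousand nodes to 100% at one hundred thousand, at which point the machine does nothing but checkpoint and recover.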
The goal of the Resilience Summit is to bring together experts in fault tolerance and resilience for high-performance computing from national laboratories and universities to present their achievements and to discuss the challenges ahead. A secondary goal is to raise awareness in the HPC community of existing solutions, ongoing and planned work, and future research and development needs. The workshop program consists of a series of invited talks by experts and a round-table discussion.
7:30 – 8:30 | Breakfast
8:30 – 10:00 | Welcome and Introduction, Stephen L. Scott, Oak Ridge National Laboratory, USA
"Hard Data on Soft Errors: A Global-Scale Assessment of GPGPU Memory Soft Error Rates", Imran Haque, Stanford University, USA
"Soft Errors, Silent Data Corruption, and Exascale Computing", Sarah E. Michalak, Los Alamos National Laboratory, USA
10:00 – 10:30 | Coffee Break
10:30 – 12:00 | "Scalable HPC System Monitoring", Christian Engelmann, Oak Ridge National Laboratory, USA
"Scalable HPC Monitoring and Analysis for Understanding and Automated Response", Jim Brandt, Sandia National Laboratories, USA
"Mining Event Log Patterns in HPC Systems", Ana Gainaru, University of Illinois at Urbana-Champaign, USA
12:00 – 1:30 | Lunch Break (on your own)
1:30 – 3:00 | "Integrating Fault Tolerance into the Monte Carlo Application Toolkit", Rob Aulwes, Los Alamos National Laboratory, USA
"HPC Rejuvenation and GPGPU Checkpoint Model", Chokchai (Box) Leangsuksun, Louisiana Tech University, USA
"An Uncoordinated Checkpoint Protocol for Send-deterministic HPC Applications", Amina Guermouche, INRIA, France
3:00 – 3:30 | Coffee Break
3:30 – 5:00 | "VolpexMPI: Robust Execution of MPI Applications through Process Replication", Edgar Gabriel, University of Houston, USA
Discussion: "The Future of HPC Resilience - Research Challenges and Opportunities", Stephen L. Scott, Oak Ridge National Laboratory, USA, and Chokchai (Box) Leangsuksun, Louisiana Tech University, USA
Closing, Stephen L. Scott, Oak Ridge National Laboratory, USA
