Data-Intensive High Performance Computing Pedagogic Modules

About

These open source pedagogic modules are developed as part of the NSF project entitled CRII: OAC: A Framework for Parallel Data-Intensive Computing on Emerging Architectures and Astroinformatics Applications (PI: Gowanlock, NSF Grant No. 1849559).

These modules are used in Mike Gowanlock’s high performance computing class at Northern Arizona University.

Overview

These pedagogic modules teach high performance computing (HPC) through data-intensive computing, which brings out key concepts studied in typical HPC courses. The data-intensive lens exposes students to real-world scenarios that arise when working with data. Examples covered in these modules include:

  • Algorithm performance that varies as a function of data distribution.
  • Load imbalance that arises due to skewed data distributions.
  • Memory-bound applications that may benefit from scaling out rather than scaling up, because they gain more from additional memory bandwidth than from additional CPU cores.
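
As a rough illustration of the load imbalance example in the list above, the following Python sketch assigns values in [0, 1) to workers by equal-width ranges and compares a uniform input with a skewed one. It is not taken from the modules: the function names, the worker count, and the choice of raising uniform samples to the fourth power to create skew are all assumptions made for illustration.

```python
import random


def partition_counts(values, num_workers):
    """Count how many values fall in each of num_workers equal-width ranges of [0, 1)."""
    counts = [0] * num_workers
    for v in values:
        # Clamp the index so a value of exactly 1.0 would still map to the last worker.
        idx = min(int(v * num_workers), num_workers - 1)
        counts[idx] += 1
    return counts


def imbalance(counts):
    """Ratio of the busiest worker's load to the mean load (1.0 is perfectly balanced)."""
    mean = sum(counts) / len(counts)
    return max(counts) / mean


# Hypothetical experiment parameters, chosen only for illustration.
rng = random.Random(0)
n, num_workers = 100_000, 8

uniform = [rng.random() for _ in range(n)]
# Raising uniform samples to the 4th power concentrates values near 0.
skewed = [rng.random() ** 4 for _ in range(n)]

uniform_imbalance = imbalance(partition_counts(uniform, num_workers))
skewed_imbalance = imbalance(partition_counts(skewed, num_workers))

print(f"uniform input imbalance: {uniform_imbalance:.2f}")
print(f"skewed input imbalance:  {skewed_imbalance:.2f}")
```

With the skewed input, most values land in the first range, so one worker receives far more work than the others while the rest sit mostly idle. This is the kind of effect that a static, range-based work assignment can produce on real data.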

These pedagogic modules target small clusters, so that students can obtain resources shortly after submitting a job to the job queue. Many of the key concepts in the modules can be explored on a typical workstation with roughly 24 physical cores; however, the modules typically exceed the capacity of a laptop. In most cases, the input sizes and computational intensity can be increased to scale to more nodes/cores than presented in these modules.

Audience

Computer scientists and computational/domain scientists can benefit from these modules.

Questions

Please e-mail me if:

  • You have any questions.
  • You are an instructor who wants to use these modules and would like solutions to the problems.
  • You have found bugs in the modules, or material that should be clarified.

Download the Source

If you would like to download the webpages of these modules so that you can modify the content and add your own modules, see the links below:

  • The static webpages can be found here.
  • This webpage has been developed using Hugo. The Hugo source files can be found here.

Acknowledgments

This material is based upon work supported by the National Science Foundation under Grant No. 1849559.