Distribution Sort: Exponential 1

Introduction: Skewed Data Distributions

Distribution sort performs best when the buckets have a similar number of elements assigned to each process rank. In this exercise, you will reuse your code in programming activity #1 to perform distribution sort on exponentially distributed data. Like Programming Activity #1, you will use equal width bins assigned to each process rank. We will study how performance changes due to using a skewed data distribution.

Programming Activity #2

Modify distribution_sort_exponential_starter.c and incorporate your code from Programming Activity #1 into this file. The only difference between distribution_sort_exponential_starter.c and distribution_sort_uniform_starter.c that you used in Programming Activity #1 is that the generated data uses an exponential distribution with $\lambda=4$ instead of a uniform distribution.

Complete the table below using the same instructions outlined in Programming Activity #1.

# of Ranks ($p$)	Total Time (s)	Time to Distribute (s)	Time to Sort (s)	Parallel Speedup	Parallel Efficiency	Global Sum	Job Script Name (*.sh)
1
2
4
8
12
16
20

Using the data you collected in your table, answer the following questions. Be sure to refer to specific values in your table when responding to these questions.

Q8: Does performance differ between the uniformly distributed (Programming Activity #1) and exponentially distributed data? If so, explain the performance discrepancy.

Last updated on Jan 8, 2020