CSCI 3500: Studio 14
OpenMP Configuration
In this studio, you will:
Configure the way that OpenMP executes a parallel program
Please complete the required exercises below, as well as any optional
enrichment exercises that you wish to complete.
As you work through these exercises, please record your answers in a text
file. When finished, submit your work via the Git repository.
Make sure that the name of each person who worked on these exercises
is listed in the first answer, and make sure you number each of your responses
so it is easy to match your responses with each exercise.
Required Exercises
- As the answer to the first exercise, list the names of the people who
worked together on this studio.
- Create a parallel-for loop using the previous studio as a template. We are interested in how this loop is actually partitioned across different threads. To help with this, OpenMP assigns each of its threads a unique identifying number and provides functions to access it. For each iteration of your loop, print the loop index and the number of the currently executing thread in a single printf() statement. Use the function omp_get_thread_num() to access the currently executing thread's number. Copy and paste your program output. A sketch appears below.
Note: the OpenMP functions are not included in Linux's man pages, but you can access a reference sheet at the following link. Be sure to include omp.h and compile with -fopenmp.
http://www.openmp.org/mp-documents/OpenMP-4.5-1115-CPP-web.pdf
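A minimal sketch of what this program might look like; the iteration count of 16 is an arbitrary placeholder, not a value specified by the studio:

    #include <stdio.h>
    #include <omp.h>

    int main() {
        /* Each iteration reports its index and the thread that ran it */
        #pragma omp parallel for
        for (int i = 0; i < 16; i++) {
            printf("iteration %2d ran on thread %d\n", i, omp_get_thread_num());
        }
        return 0;
    }

Compile with something like gcc -fopenmp studio14.c -o studio14 (the file name is just an example).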
- You might wonder just how many threads OpenMP will make. You can query this as well: print out the maximum number of threads OpenMP will use with the function omp_get_max_threads(). Copy and paste your result.
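This query can be a single line placed outside the parallel region, for example:

    /* Report the size of the thread team OpenMP would use by default */
    printf("max threads: %d\n", omp_get_max_threads());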
- The maximum number of threads on Hopper is great for high performance, but that many threads will become confusing quickly. Set the maximum number of threads OpenMP should use to five (5) with the function omp_set_num_threads(), and then re-run the loop you wrote above. Copy and paste your results.
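The call must come before the parallel region begins, as in this sketch:

    omp_set_num_threads(5);   /* cap the thread team at five */

    #pragma omp parallel for
    for (int i = 0; i < 16; i++) {
        printf("iteration %2d ran on thread %d\n", i, omp_get_thread_num());
    }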
- You might wonder how fairly OpenMP schedules its work. Set your loop to have 25 iterations and re-run it. How many iterations does each thread handle?
- Think about a situation in which it would be undesirable or unfair for OpenMP to assign the same number of loop iterations to each thread. Why might an even split be a bad idea there?
- Let's simulate a bad situation for OpenMP by making the first five loop iterations take much longer than the others. Use the Linux sleep() function to cause each of the first five loop iterations to sleep for one second. That is, inside your loop insert the statement: if ( index <= 4 ) sleep(1);
What do you think will happen? If the five slow iterations are split evenly across five threads, then the program should take about one second. However, if the first five loop iterations are all allocated to a single thread, then the program will take about five seconds.
Run your program and confirm or deny your hypothesis. Time your program with the time command.
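A sketch of the modified loop, assuming the loop variable is named index; sleep() requires unistd.h:

    #include <unistd.h>   /* for sleep() */

    #pragma omp parallel for
    for (int index = 0; index < 25; index++) {
        if (index <= 4) sleep(1);   /* first five iterations are slow */
        printf("iteration %2d ran on thread %d\n", index, omp_get_thread_num());
    }

Then time a run from the shell, e.g. time ./studio14; the "real" line that time prints is the wall-clock duration to compare against your hypothesis.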
- The above behavior is caused by OpenMP's default scheduling policy, which statically assigns work to each thread. This means that the iterations are divided among the threads up front, before the loop runs, which can be very efficient, but the assignment cannot adapt to how long each iteration actually takes. However, OpenMP supports many different scheduling strategies. Configure your system to dynamically assign work by modifying your parallel-for loop declaration:
#pragma omp parallel for schedule( dynamic, 1 )
How long does your program take now?
- What else has changed about how your program is scheduled?
- The second parameter to the schedule clause is called the chunk size. Modify this value and observe the effects. What do you think the chunk size specifies?
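For example, a chunk size of five (an arbitrary value to experiment with) would look like:

    /* Threads now claim work five iterations at a time */
    #pragma omp parallel for schedule( dynamic, 5 )
    for (int index = 0; index < 25; index++) {
        if (index <= 4) sleep(1);
        printf("iteration %2d ran on thread %d\n", index, omp_get_thread_num());
    }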
Optional Enrichment Exercises
- No optional exercises