CSCI 3500: Studio 13
OpenMP
We have been studying Pthreads, the low-level threading interface
provided by the operating system. However, most parallel programs are not
written directly in Pthreads. Instead, programmers use a parallel
concurrency platform, which implements threading on the programmer's
behalf and provides a higher-level interface for parallel programming.
OpenMP is one such platform.
In this studio, you will:
- Write a program that tests each of the first 20,000,000 positive integers for primality
- Parallelize this program with OpenMP
Please complete the required exercises below, as well as any optional
enrichment exercises that you wish to complete.
As you work through these exercises, please record your answers in a text
file. When finished, submit your work via the Git repository.
Make sure that the name of each person who worked on these exercises
is listed in the first answer, and make sure you number each of your responses
so it is easy to match your responses with each exercise.
Required Exercises
- As the answer to the first exercise, list the names of the people who
worked together on this studio.
- Recall that a prime number is an integer greater than one that is
divisible only by itself and one. Write a function that takes a single
integer argument and determines whether that number is prime. Return 1 if
the number is prime, and return 0 if it is not. The best way to do this is
simple brute force: use the modulus operator to test possible divisors. If
no possible divisor divides the candidate evenly, then it's prime.
Test your function on some prime numbers such as 7, 23, 101, or 982451653.
Copy and paste your results.
- That last number is a useful test case. We want to test a _lot_ of
numbers for primality, so we want to ensure that we can do this very
efficiently. On hopper.slu.edu the instructor's code takes only 0.005
seconds to test that particular number (and his code is not even
particularly efficient). We want your code to be in roughly the same
class. A few points:
  - You only need to test odd divisors (after checking divisibility by 2)
  - You only need to test divisors up to the square root of your candidate
  - If you aren't already, use a for loop to check all possible divisors:
    for( i = 3; i*i <= candidate; i+=2 )
Measure your program with the time command and copy-paste the results.
It should be relatively efficient.
- Now, modify your program so that it tests all numbers from 1 to N for primality.
Your program should print out, in order, all prime numbers less than N. Copy and paste
your program output for N equal to 100. Look up a list of prime numbers on the internet
and double-check your results.
- Now we want to think about parallelizing your program. To do so we need to identify
some logically independent operations that may occur in parallel. In the previous exercise
you tested all numbers from 1 to 100 for primality. Testing each number
is logically independent of testing any other number. Why is that?
- The operations needed to test an individual number for primality are not
entirely independent. For example, if we wanted to test the number 105 for primality
then, using the square root of 105 as an upper bound,
we would have to see whether it was divisible by any member of the set
{2, 3, 5, 7, 9}. Why is testing 7 and 9 not independent of testing the
number 5?
- Make a copy of your program so that you have both a sequential and a parallel
version of your code. You can convert a for loop into a parallel-for
loop by inserting the following statement immediately before the loop:
#pragma omp parallel for
You will need to include omp.h, and you must add the flag "-fopenmp"
to the compiler command line. Now test your program again for N equal to 100.
Make sure that the output of the parallelized version matches the output of
your sequential version. Does it?
- Go ahead and comment out your print statements in both programs once you are convinced
that your parallel program is correct. We want to time each program, and the time
required to print your results would vastly outweigh the time required to do the computations.
Time each program for N equal to one million (1000000). This should go quickly but
still be enough work to see definite parallel improvement. Paste your results.
Time each program for N equal to twenty million (20000000). Paste your results.
- Compare the effort required to use OpenMP to the effort used with Pthreads.
Give the steps required to approach this problem with Pthreads.
Optional Enrichment Exercises
- No optional exercises