For HPC applications, OpenMP should be used instead of serial (scalar) coding, since nodes are multi-CPU and multi-core. When the problem size (in time and/or memory) cannot be handled by a single node, you should move to MPI or hybrid MPI/OpenMP programming.
Home site: http://openmp.org/wp/
LLNL tutorial: https://computing.llnl.gov/tutorials/openMP/
rpm -qa | grep gomp                    # check whether the GNU OpenMP runtime (libgomp) is installed
rpm -ql libgomp-4.4.0-6.el5.x86_64     # list the files installed by the libgomp package
OpenMP provides a portable, scalable model for developers of shared-memory parallel applications based on threads.
The API supports C/C++ and Fortran.
#include <omp.h>

int main() {
    int var1, var2, var3;

    // Serial code executed by the master thread

    #pragma omp parallel private(var1, var2) shared(var3)   // OpenMP directive
    {
        // Parallel section executed by all threads
    }

    // Resume serial code executed by the master thread
    return 0;
}
OpenMP also defines library routines to query and set the OpenMP runtime environment. Examples:
omp_set_num_threads()
omp_get_num_threads()
omp_get_thread_num()
omp_get_num_procs()
...
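A minimal sketch of how these runtime routines can be used (the thread count of 4 is an arbitrary value chosen for illustration):

#include <stdio.h>
#include <omp.h>

int main() {
    printf("Available processors: %d\n", omp_get_num_procs());

    omp_set_num_threads(4);              // request 4 threads (arbitrary example value)

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();  // unique ID of the calling thread
        int nth = omp_get_num_threads(); // actual number of threads in the team
        printf("Thread %d of %d\n", tid, nth);
    }
    return 0;
}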
The main directives are described below.
Syntax:
#pragma omp <directive-name> [clause, ...]
{
    // parallelized region
}   // implicit synchronization at the end of the region
Clauses specify additional information for the directive.
For example, private(i) is a clause of the parallel directive stating that each thread will have its own copy of the variable i.
int i;
#pragma omp parallel private(i)
{
    i = rand();
    // ...
}
Syntax:
#pragma omp barrier   // explicit synchronization
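A minimal sketch of why an explicit barrier matters (array size and values are arbitrary): thread 0 must not read the other threads' results before every thread has written its own slot.

#include <stdio.h>
#include <omp.h>

int main() {
    double partial[64] = {0};            // one slot per possible thread (illustrative size)

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        partial[tid] = tid * 2.0;        // each thread writes its own slot

        #pragma omp barrier              // wait until every thread has written its slot

        if (tid == 0) {                  // thread 0 can now safely read all slots
            double sum = 0.0;
            for (int i = 0; i < omp_get_num_threads(); i++)
                sum += partial[i];
            printf("sum = %f\n", sum);
        }
    }
    return 0;
}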
A parallel region is a block of code that will be executed by multiple threads. This is the fundamental OpenMP parallel construct.
The SECTIONS directive is a non-iterative work-sharing construct. It specifies that the enclosed section(s) of code are to be divided among the threads in the team.
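A minimal sketch of the SECTIONS construct, assuming two independent tasks (the printf calls stand in for real work); each SECTION is executed once, by one thread of the team:

#include <stdio.h>
#include <omp.h>

int main() {
    #pragma omp parallel
    {
        #pragma omp sections
        {
            #pragma omp section
            { printf("task A on thread %d\n", omp_get_thread_num()); }

            #pragma omp section
            { printf("task B on thread %d\n", omp_get_thread_num()); }
        }   // implicit barrier at the end of the sections construct
    }
    return 0;
}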
By default, MPI is not guaranteed to be thread-safe.
The user can request thread support by calling MPI_Init_thread (in place of MPI_Init). There are 4 support levels:
MPI_THREAD_SINGLE: only one thread will execute.
MPI_THREAD_FUNNELED: only the main thread makes MPI calls.
MPI_THREAD_SERIALIZED: multiple threads may make MPI calls, but only one at a time.
MPI_THREAD_MULTIPLE: multiple threads may make MPI calls concurrently, with no restrictions.
Syntax:
int MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
"Required" is the level required by the user. "Provided" is the level provided by MPI.
Hybrid program
#include "mpi.h" int main(int argc, char **argv){ int rank, size, ierr, i; MPI_Init(&argc,&argv[]); MPI_Comm_rank (...,&rank); MPI_Comm_size (...,&size); #pragma omp parallel for for(i=0; i<n; i++) { printf("do some work\n"; } MPI_Finalize();