
Parallel programming: why and when

For HPC applications, openMP should be used instead of purely serial code, since nodes are multi-CPU and multi-core. When the problem size (in time and/or memory) cannot be supported by a single node, you should move to MPI or hybrid MPI/openMP programming.

openMP

Home site: http://openmp.org/wp/

Tutorial LLNL: https://computing.llnl.gov/tutorials/openMP/

Tutorial

To check that the GNU OpenMP runtime (libgomp) is installed:

rpm -qa | grep gomp
rpm -ql libgomp-4.4.0-6.el5.x86_64

Shared memory model

OpenMP provides a portable, scalable model for developers of shared memory parallel applications with Threads.

The API supports C/C++ and Fortran.

openMP Execution Model

  • Begin execution as a single process (master thread)
  • Start of a parallel construct (using special directives): Master thread creates team of threads
  • Fork-join model of parallel execution

Structure example in C
#include <omp.h>

int main ()  {

int var1, var2, var3;

// Serial code executed by master thread

#pragma omp parallel private(var1, var2) shared(var3)  //openMP directive
  {
  //  Parallel section executed by all threads
  }  // implicit barrier: the team joins here

// Resume serial code executed by master thread

return 0;
}

Library routines

OpenMP defines library routines useful for getting and setting the openMP environment. Examples:

 omp_set_num_threads()
 omp_get_num_threads()
 omp_get_thread_num()
 omp_get_num_procs()
 ...

openMP Directives

Main directives are:

Fork: PARALLEL, FOR, SECTIONS, SINGLE, TASK, MASTER

Syntax:

  #pragma omp <directive-name> [clause, ..]
  {
  // parallelized region
  }  //implicit synchronization

Clauses are used to specify additional information with the directive.

For example, private(i) is a clause to the parallel directive stating that each thread will have its own "i" variable.

int i;
#pragma omp parallel private(i)
  {
  i = rand();   // each thread works on its own private copy of i
  }
Join: BARRIER

Syntax

#pragma omp barrier //explicit synchronization
PARALLEL

A parallel region is a block of code that will be executed by multiple threads. This is the fundamental OpenMP parallel construct.

parallel1.c

parallel2.c

PARALLEL FOR

Parallelizes a loop, dividing the iterations among the threads

parallel_for.c

SECTIONS

The SECTIONS directive is a non-iterative work-sharing construct. It specifies that the enclosed section(s) of code are to be divided among the threads in the team.

parallel_sections.c

SINGLE

The SINGLE directive specifies that the enclosed code is executed by only one thread in the team (useful e.g. for I/O inside a parallel region).

REDUCTION

The REDUCTION clause gives each thread a private copy of the listed variables and combines the copies with the specified operator (e.g. +) at the end of the construct.

Hybrid programming MPI/openMP

MPI + openMP

MPI thread safety

By default, an MPI implementation is not guaranteed to be thread safe.

The user can request thread support by calling MPI_Init_thread (in place of MPI_Init). There are 4 support levels:

  • MPI_THREAD_SINGLE : no thread support
  • MPI_THREAD_FUNNELED : only the master thread makes MPI calls (default)
  • MPI_THREAD_SERIALIZED : more than one thread can call MPI, but only one at a time
  • MPI_THREAD_MULTIPLE : fully thread safe

Syntax:

int MPI_Init_thread(int *argc, char ***argv, int required, int *provided)

"Required" is the level required by the user. "Provided" is the level provided by MPI.

Hybrid program

#include "mpi.h"
#include <stdio.h>

int main(int argc, char **argv){
int rank, size, i, n = 10;   /* n: number of loop iterations */
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

#pragma omp parallel for
for(i=0; i<n; i++)
  {
  printf("do some work\n");
  }
MPI_Finalize();
return 0;
}
roberto.alfieri/user/reti/openmp.txt · Last modified: 30/08/2012 16:45 by roberto.alfieri