===== MPI/theophys Project =====
==== WorkerNodes ====

== Software environment ==

   * SL5.x x86_64
   * openMPI >= 1.3 (1.4 would be better)
   * MPICH2
   * GNU C, C++ and Fortran compilers (gfortran, g77, g90?)
   * Support for commercial compilers?
   * Scientific libraries:
      * openMP: multithreading support (libgomp)
      * HDF5: data storage and management library
      * BLAS: Basic Linear Algebra Subprograms
      * LAPACK: Linear Algebra PACKage
      * GSL: GNU Scientific Library
      * GMP: GNU Multiple Precision arithmetic library
      * GLPK: GNU Linear Programming Kit
      * FFTW3: Fast Fourier Transform library
      * Octave: high-level language for numerical computations
== Installation example ==
<code>
# Enable the EPEL repository, then install the packages for the libraries above
yum install -y yum-conf-epel.noarch
yum install -y octave hdf5-devel glpk fftw3
yum install -y libgomp blas-devel gsl-devel gmp-devel
</code>
== theophys TAG ==
The following is a possible TAG to be published by theophys-compliant sites:
  GlueHostApplicationSoftwareRunTimeEnvironment: VO-theophys-gcc41 ??
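For illustration, a user job could then select theophys-compliant sites by testing this tag in its JDL requirements; a minimal sketch, assuming the tag above is the one finally published:

<code>
Requirements = Member("VO-theophys-gcc41",
                      other.GlueHostApplicationSoftwareRunTimeEnvironment);
</code>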
==== Cluster ====

== Published TAGs for MPI ==
Mpi-start is the way to start MPI jobs:

   MPI-START

At least openMPI should be installed:

   MPI_OPENMPI
   MPI_OPENMPI_VERSION="x.y.z"

A shared home is recommended, but file distribution is supported by MPI-start:

   MPI_SHARED_HOME | MPI_NO_SHARED_HOME

Remote start-up of MPI jobs can be achieved via password-less SSH:

   MPI_SSH_HOST_BASED_AUTH

Infiniband is recommended, but Gbit (or 10 Gb) Ethernet can be used:

   MPI-Infiniband | MPI-Ethernet
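A job can require these capabilities by testing the published tags in its Requirements expression; a sketch following the usual EGEE MPI pattern:

<code>
Requirements = Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment)
   && Member("MPI_OPENMPI", other.GlueHostApplicationSoftwareRunTimeEnvironment);
</code>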
== Open Issues ==

  * Is it possible to publish the actual number of free CPUs per queue?

  * How is CpuNumber used in the match-making process?
<code>
At the moment CpuNumber is not used at all for match-making.
Temporary workaround in the JDL:
  CpuNumber = n;
  Requirements = other.GlueCEInfoTotalCPUs >= CpuNumber;
</code>
==== JDL ====

== Typical parallel JDL ==

  JobType = "Normal";
  CpuNumber = 8;
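For context, a complete submission file built around this fragment might look as follows. This is a sketch only: the wrapper script, application name and sandbox files are hypothetical, and the mpi-start invocation details depend on the site setup.

<code>
[
  JobType       = "Normal";
  CpuNumber     = 8;
  Executable    = "mpi-start-wrapper.sh";  // hypothetical wrapper around mpi-start
  Arguments     = "my_mpi_app OPENMPI";    // hypothetical MPI application
  InputSandbox  = {"mpi-start-wrapper.sh", "my_mpi_app"};
  OutputSandbox = {"std.out", "std.err"};
  StdOutput     = "std.out";
  StdError      = "std.err";
  Requirements  = Member("MPI-START", other.GlueHostApplicationSoftwareRunTimeEnvironment);
]
</code>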
== Multithread support ==

  SMPGranularity = 8;
  WholeNodes = True;

Multithread support is desirable and should be integrated
into the middleware as soon as possible.
== Open Issues ==

  * Is it possible to integrate Granularity/WholeNodes directly in InfnGrid?
<code>
CREAM and BLAH: see https://twiki.cern.ch/twiki/bin/view/EGEE/ParameterPassing ??
WMS: included in WMS 3.3
</code>
==== Parallel and sequential jobs ====

VOMS Roles can be used to restrict access to parallel queues.

== VOMS Role = "parallel" ==
The [[https://voms.cnaf.infn.it:8443/voms/infngrid/SearchRoles.do | Role]] is assigned by the VO manager
and released by [[https://voms.cnaf.infn.it:8443/voms/theophys/Siblings.do |VOMS]] only on explicit request.

== Setup example ==
<code>
# site-info.def:
PARALLEL_GROUP_ENABLE="/infngrid/ROLE=parallel"

# /opt/glite/yaim/defaults/ig-site.pre:
FQANVOVIEWS=yes

# groups.conf:
"/infngrid/ROLE=parallel"::::

# Request a proxy carrying the parallel Role and inspect its attributes:
voms-proxy-init -voms infngrid:/infngrid/Role=parallel
voms-proxy-info -all
>....
>attribute : /infngrid/Role=parallel/Capability=NULL
>attribute : /infngrid/Role=NULL/Capability=NULL
>...
</code>
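Once the proxy carries the Role, jobs are submitted as usual; a hypothetical gLite submission (the JDL file name is illustrative):

<code>
glite-wms-job-submit -a -o jobIDs.txt parallel.jdl
</code>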
==== MPI multi-thread jobs ====

MPI and multi-thread programs can be combined to exploit
the upcoming multicore architectures.
Hybrid multithread/MPI programming leads to a request
of N CPUs with a smaller number of MPI processes (N/thread_num).
At present this programming model is not supported in EGEE.
Possible solution:
change the value type of WholeNodes from boolean to integer.
Example:
  SMPGranularity = 8;
  WholeNodes = 4;
This syntax would lead to
  qsub -l nodes=4:ppn=GlueHostArchitectureSMPSize
where ppn is a number >= 8.
The WholeNodes value should be passed to mpi-start
as the number of MPI processes, and mpi-start should be
modified accordingly.

Mixed MPI/multithread programs require thread-safe
MPI implementations.
Thread safety can be easily verified:

<code>
int prov;
/* MPI_THREAD_MULTIPLE (value 3) requests full thread safety */
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &prov);
printf("MPI_Init_thread provided: %d\n", prov);
</code>
The third parameter (MPI_THREAD_MULTIPLE, value 3) requests full
thread-safety support. If the value returned in prov is 0
(MPI_THREAD_SINGLE), thread support is not provided.
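Putting the pieces together, a minimal hybrid MPI/OpenMP sketch (illustrative only; compile, e.g., with mpicc -fopenmp):

<code>
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv) {
    int prov, rank;
    /* Request full thread safety for the hybrid run */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &prov);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    /* Each MPI process spawns a team of OpenMP threads */
    #pragma omp parallel
    printf("MPI rank %d, thread %d of %d\n",
           rank, omp_get_thread_num(), omp_get_num_threads());
    MPI_Finalize();
    return 0;
}
</code>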
==== Scheduling ====

== Objectives ==
  * Minimize job starvation
  * Maximize resource exploitation

== Possible scenario ==
MPI sites with at least 2 queues sharing the same pool of WNs (see the sketch after this list):
  * **High-priority parallel queue**
      * accessible only with a special Role (Role=parallel?)
  * **Low-priority sequential queue**
      * preemptable (renice or requeue?)
      * short WallClockTime (less than 6 hours?)
      * accessible only with a special Role (Role=short?).
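One possible realization with Torque/Maui; a sketch only, where the queue names, priorities and the requeue-based preemption policy are assumptions to be tuned per site:

<code>
# Torque: cap the sequential queue's wall-clock time at 6 hours
qmgr -c "set queue short resources_max.walltime = 06:00:00"

# maui.cfg: let parallel jobs preempt (requeue) short sequential jobs
PREEMPTPOLICY      REQUEUE
QOSCFG[hiprio]     QFLAGS=PREEMPTOR
QOSCFG[loprio]     QFLAGS=PREEMPTEE
CLASSCFG[parallel] QDEF=hiprio PRIORITY=1000
CLASSCFG[short]    QDEF=loprio PRIORITY=1
</code>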
==== Revision history ====

  * 20100225 - R. DePietri, F. DiRenzo - Users' required libraries
  * 20100210 - C. Aiftimiei, R. Alfieri, M. Bencivenni, T. Ferrari - First version
  
