Differences

These are the differences between the selected revision and the current version of the page.

--- roberto.alfieri:pub:vectorization [12/06/2017 16:37]
roberto.alfieri
+++ roberto.alfieri:pub:vectorization [13/06/2017 20:15]
roberto.alfieri
@@ Line 1: / Line 1: @@
 ====== Vectorization ======
+[[ https://hpc-forge.cineca.it/files/CoursesDev/public/2016/Milan/Enabling_software_for_high_scalable_intel_arch/course_part1.pdf | Introduction to Intel scalable architectures ]]
+Processor peak performance includes the speed-up provided by vector instructions,
+but exploiting it requires specific programming techniques.
+{{:roberto.alfieri:pub:vectortrend.png?200|}}
+{{:roberto.alfieri:pub:waytovectorize.png?200|}}
+Auto-vectorization is the easiest and most portable way to get vectorization.
+The compiler recognizes several vectorization options.
+Main vectorization options:
+^ Target                  ^ Intel compiler option  ^
+^ KNL (Knights Landing)   | -xMIC-AVX512           |
+^ BDW (Broadwell)         | -xCORE-AVX2            |
+^ Disable vectorization   | -no-vec                |
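As an illustration of the kind of loop a compiler can usually auto-vectorize, here is a minimal sketch. The function name `saxpy` and its signature are hypothetical and not part of the original page; the `restrict` qualifiers are an assumption that the two arrays never overlap, which is what lets the compiler treat the iterations as independent. Whether the loop is actually vectorized depends on the compiler, the optimization level, and the target option chosen from the table above (for example -xCORE-AVX2).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: y[i] = a * x[i] + y[i].
 * Unit stride, no function calls, no dependence between iterations.
 * restrict tells the compiler that x and y do not alias (assumption). */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    /* Debug check of the no-aliasing assumption stated above. */
    assert(x != y);

    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Building the same file with -no-vec gives a scalar baseline to compare timings against.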
+Not all loops can be vectorized. Some examples:
+  * Loop with dependencies between iterations
+<code>
+for (i=1; i<MAX; i++) {
+   d[i] = e[i] - a[i-1];
+   a[i] = b[i] + c[i];
+}
+</code>
+  * Complex loops
+  * Function calls inside the loop:
+      for (int i = 0; i < N; i++)   a[i] = foo(b[i]);
+  * Loops on data that are not aligned in memory
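The first case in the list, a dependence between iterations, can sometimes be removed by splitting the loop in two (loop fission). The sketch below is one possible rewrite, not the method prescribed by the page: `update_fused` reproduces the loop from the example, and `update_split` is a hypothetical variant that computes all of `a` first and then all of `d`. It assumes the five arrays do not overlap. Each of the two resulting loops has no loop-carried dependence, so a compiler may be able to vectorize them.

```c
#include <assert.h>
#include <stddef.h>

/* Loop from the example: d[i] reads a[i-1], written one iteration earlier. */
void update_fused(size_t n, float *a, const float *b, const float *c,
                  float *d, const float *e)
{
    for (size_t i = 1; i < n; i++) {
        d[i] = e[i] - a[i - 1];
        a[i] = b[i] + c[i];
    }
}

/* Hypothetical rewrite by loop fission; arrays are assumed not to overlap. */
void update_split(size_t n, float *restrict a, const float *restrict b,
                  const float *restrict c, float *restrict d,
                  const float *restrict e)
{
    /* Debug check of the no-overlap assumption for the two written arrays. */
    assert(a != d);

    /* First loop: every a[i] depends only on b[i] and c[i]. */
    for (size_t i = 1; i < n; i++) {
        a[i] = b[i] + c[i];
    }
    /* Second loop: a is now read-only, so iterations are independent.
     * a[0] is untouched, exactly as in the fused version. */
    for (size_t i = 1; i < n; i++) {
        d[i] = e[i] - a[i - 1];
    }
}
```

Both versions produce the same `a` and `d`, because in the original loop `a[i-1]` has already been updated (or is the untouched `a[0]`) by the time `d[i]` reads it.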

UNIVERSITÀ DI PARMA, DIPARTIMENTO DI SCIENZE MATEMATICHE, FISICHE E INFORMATICHE