====== Vectorization ======

[[ http://www.training.prace-ri.eu/uploads/tx_pracetmo/intel_mic_optimization.pdf | Vectorization and Code Optimization ]]

Processor peak performance figures include the speed-up provided by vector instructions, but exploiting that speed-up requires specific programming techniques.

{{:roberto.alfieri:pub:vectortrend.png?200|}} {{:roberto.alfieri:pub:waytovectorize.png?200|}}

Auto-vectorization is the easiest and most portable way to get vectorization. The compiler recognizes several vectorization options.

Main vectorization options:

^         ^ Intel compiler ^
^ KNL     | -xMIC-AVX512   |
^ BDW     | -xCORE-AVX2    |
^ Disable | -no-vec        |

Not all loops can be vectorized. Some examples:

  * Loops with dependencies between iterations: ''%%for (i=1; i%%''
  * Complex loops
  * Function calls inside the loop: ''%%for (int i = 0; i < N; i++) a[i] = foo(b[i]);%%''
  * Loops on data that are not aligned in memory
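
The difference between a loop the compiler can auto-vectorize and one with a dependency between iterations can be sketched as follows. This is only an illustrative example: the function names ''add_arrays'' and ''prefix_sum'' are hypothetical and do not come from the linked material, and whether a given compiler actually vectorizes the first loop depends on the compiler version and on the options used.

```c
#include <stddef.h>

/* Hypothetical example: independent iterations.
 * Each a[i] depends only on b[i] and c[i], so several elements can be
 * computed by a single vector instruction. The C99 "restrict" qualifier
 * promises the compiler that the three arrays do not overlap. */
void add_arrays(float *restrict a, const float *restrict b,
                const float *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Hypothetical example: dependency between iterations.
 * a[i] needs the value of a[i-1] written by the previous iteration,
 * so the iterations cannot simply be executed side by side. */
void prefix_sum(float *a, const float *b, size_t n)
{
    for (size_t i = 1; i < n; i++)
        a[i] = a[i - 1] + b[i];
}
```

Both functions are valid C and give correct results either way. The point is only that the first loop is a natural candidate for auto-vectorization, while the second has to be restructured before vector instructions can help.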