Hello SIMD World
TIP
You can do this exercise locally or on Compiler Explorer (unless you want to use vir-simd).
Boilerplate:
#include <experimental/simd>
#include <iostream>

namespace stdx = std::experimental;

template <class T, class A>
std::ostream& operator<<(std::ostream& s, const stdx::simd<T, A>& v) {
    s << '[' << v[0];
    for (std::size_t i = 1; i < v.size(); ++i) {
        s << ", " << v[i];
    }
    return s << ']';
}

int main() {
    return 0;
}
simd constructors
Test the four different constructors:
- default
- broadcast
- generator
- load
… using different element types: double, char, unsigned, …
Check what happens if you use a non-vectorizable type.
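As a starting point, the four constructors could look like the sketch below. The function name constructor_demo is hypothetical and exists only so the result can be checked; it uses float with native_simd, and the other element types and ABI tags are left to try.

```cpp
#include <experimental/simd>
#include <cstddef>

namespace stdx = std::experimental;

// Hypothetical helper: builds one simd with each constructor and returns the
// sum over all lanes of all four objects.
inline float constructor_demo() {
    using V = stdx::native_simd<float>;

    V a;                                   // default: lanes are left uninitialized
    a = 0.f;                               // so assign before reading them
    V b(1.f);                              // broadcast: every lane holds 1
    V c([](auto i) { return float(i); });  // generator: lane i holds i
    alignas(stdx::memory_alignment_v<V>) float mem[V::size()] = {};
    V d(mem, stdx::vector_aligned);        // load: lanes are copied from mem

    return stdx::reduce(a + b + c + d);    // horizontal sum of all lanes
}
```

The generator is called once per lane with an integral constant, which is why the lambda takes auto.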
Test different ABI tags (simd<T>, native_simd<T>, fixed_size_simd<T, N>, simd<T, simd_abi::scalar>).
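One way to see what an ABI tag changes is to look at size(). A minimal sketch follows; the variable names compatible_width and native_width are made up for illustration, and their values depend on the target and compiler flags.

```cpp
#include <experimental/simd>
#include <cstddef>

namespace stdx = std::experimental;

// simd<T> uses the "compatible" ABI tag, native_simd<T> the "native" one.
// Both widths are chosen by the implementation for the compilation target.
constexpr std::size_t compatible_width = stdx::simd<float>::size();
constexpr std::size_t native_width     = stdx::native_simd<float>::size();

// These two widths are fixed by the type itself.
static_assert(stdx::fixed_size_simd<float, 5>::size() == 5);
static_assert(stdx::simd<float, stdx::simd_abi::scalar>::size() == 1);
```

Printing the two target-dependent widths with different -march flags is a quick way to compare the tags.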
Do an aligned load on an unaligned address.
TIP
You just learned a new reason for SIGSEGV. Remember this when your future self stares at the debugger, puzzled how the pointer can be out-of-bounds…
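For comparison, here is a sketch of the two load flags used correctly. The helper names load_anywhere and load_aligned are hypothetical. With vector_aligned the caller promises that the pointer is aligned to memory_alignment_v<V>; with element_aligned the alignment of the element type is enough.

```cpp
#include <experimental/simd>
#include <cstddef>

namespace stdx = std::experimental;
using V = stdx::native_simd<float>;

// Hypothetical helper: safe for any valid float pointer.
inline V load_anywhere(const float* mem, std::size_t offset) {
    return V(mem + offset, stdx::element_aligned);
}

// Hypothetical helper: mem must be aligned to stdx::memory_alignment_v<V>.
// Breaking that promise is the situation the exercise asks you to provoke.
inline V load_aligned(const float* mem) {
    return V(mem, stdx::vector_aligned);
}
```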
Implement abs(simd<T, A>)
Implement and test the absolute value function (not using stdx::abs):
template <class T, class A>
simd<T, A> abs(simd<T, A> x) {
    // TODO
}
Note that a correct std::abs implementation cares about -0. (negative zero). Bonus points if you have an idea…
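One possible first attempt uses masked assignment via where. This is only a sketch, not the reference solution. The name my_abs is made up so that it cannot collide with stdx::abs, and the sketch deliberately does not solve the -0. case.

```cpp
#include <experimental/simd>

namespace stdx = std::experimental;

// Hypothetical name, chosen to avoid any confusion with stdx::abs.
template <class T, class A>
stdx::simd<T, A> my_abs(stdx::simd<T, A> x) {
    // Masked assignment: negate only the lanes that compare less than zero.
    // -0. does not compare less than zero, so it is returned unchanged.
    stdx::where(x < T(0), x) = -x;
    return x;
}
```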
Linear search
Given a std::string_view (which is a contiguous range of chars),
- … count the number of spaces.
- … return the index of the first occurrence of a given char.
- … (optional) return the index of the first occurrence of a given substring.
int count_spaces(std::string_view s) {
    // TODO
}

int find(std::string_view s, char c) {
    // TODO
}

int find(std::string_view s, std::string_view needle) {
    // TODO
}
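The first two tasks could be approached as in the sketch below: compare a whole chunk against the wanted char, then reduce the resulting mask with popcount or find_first_set. Returning -1 for "not found" is an assumption, since the exercise does not specify it. The scalar epilogue is the simplest option and not necessarily the fastest.

```cpp
#include <experimental/simd>
#include <cstddef>
#include <string_view>

namespace stdx = std::experimental;

inline int count_spaces(std::string_view s) {
    using V = stdx::native_simd<char>;
    int count = 0;
    std::size_t i = 0;
    // Main loop: compare V::size() chars at once and count the set mask lanes.
    for (; i + V::size() <= s.size(); i += V::size()) {
        const V chunk(s.data() + i, stdx::element_aligned);
        count += stdx::popcount(chunk == ' ');
    }
    // Epilogue: handle the remaining chars one by one.
    for (; i < s.size(); ++i) {
        count += s[i] == ' ';
    }
    return count;
}

inline int find(std::string_view s, char c) {
    using V = stdx::native_simd<char>;
    std::size_t i = 0;
    for (; i + V::size() <= s.size(); i += V::size()) {
        const V chunk(s.data() + i, stdx::element_aligned);
        const auto hit = chunk == c;
        if (stdx::any_of(hit)) {
            // find_first_set returns the lowest lane index that is true.
            return static_cast<int>(i) + stdx::find_first_set(hit);
        }
    }
    // Epilogue: scalar search in the tail.
    for (; i < s.size(); ++i) {
        if (s[i] == c) return static_cast<int>(i);
    }
    return -1;  // assumption: -1 signals "not found"
}
```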
Optional 1: simd_for_each
(A fully general solution of this exercise is part of vir-simd.)
Write a simd_for_each algorithm that takes a range and a generic callable:
template <std::contiguous_range R>
void simd_for_each(R&& rng, auto&& fun) {
    // Load simd's from std::ranges::data(rng) and invoke fun with each.
    // Consider how and when to write back a modified simd.
    // don't forget the epilogue
}
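A deliberately simplified sketch of the idea follows. It assumes a std::vector instead of a general contiguous range, always writes the simd back, and runs the epilogue with single-element simd objects, so the callable must accept both simd types. None of these choices is required by the exercise; they only keep the example short.

```cpp
#include <experimental/simd>
#include <cstddef>
#include <vector>

namespace stdx = std::experimental;

// Simplified variant: std::vector only, unconditional write-back.
template <class T, class F>
void simd_for_each(std::vector<T>& rng, F&& fun) {
    using V = stdx::native_simd<T>;
    T* const data = rng.data();
    std::size_t i = 0;
    // Main loop: load a full simd, let fun modify it, store it back.
    for (; i + V::size() <= rng.size(); i += V::size()) {
        V v(data + i, stdx::element_aligned);
        fun(v);
        v.copy_to(data + i, stdx::element_aligned);
    }
    // Epilogue: same steps with a simd of size 1 for the leftover elements.
    using V1 = stdx::simd<T, stdx::simd_abi::scalar>;
    for (; i < rng.size(); ++i) {
        V1 v(data + i, stdx::element_aligned);
        fun(v);
        v.copy_to(data + i, stdx::element_aligned);
    }
}
```

A call such as simd_for_each(data, [](auto& v) { v *= 3.f; }) then scales every element of a std::vector<float> in place.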
TIP
For a completely generic solution you might want to use:
Optional 2: Generalize simd_for_each
Use vir::simdize to generalize your simd_for_each from vectorizable range value types to “simdizable” range value types. I.e. make simd iteration over array of struct/std::tuple easy to use.
This example should work:
struct Point {
    float x, y, z;
};

void normalize(std::vector<Point>& data) {
    simd_for_each(data, [](auto& v) {
        auto& [x, y, z] = v;
        const auto scale = 1.f / sqrt(x * x + y * y + z * z);
        x *= scale;
        y *= scale;
        z *= scale;
    });
}
Bonus: Optimize memory access
Optimize loads and stores: from scalar access to vector access.