From SIMD Wrappers to SIMD Ranges (pt 1).
Joel Falcou, Denis Yaroshevskiy
This talk
- TODO: link to pt1 slides
- SIMD ranges are coming
- We want to share knowledge
- We need to teach you SIMD first
Let's do some algorithms
- memcmp
- memchr
- strlen
- reduce
- inclusive_scan
- min_element
- copy_if
- reverse
Notes
- std::mismatch - Nikolas Klauser
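
For reference, the scalar contract that a vectorized memcmp or mismatch has to preserve can be written with the standard algorithm. This is only a minimal sketch: the helper name first_difference is made up for illustration and is not taken from the talk.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first position where a and b differ,
// or the length of the shorter input when one is a prefix of the other.
std::size_t first_difference(const std::vector<unsigned char>& a,
                             const std::vector<unsigned char>& b) {
    // The four-iterator overload (C++14) stops at the end of the shorter range.
    auto p = std::mismatch(a.begin(), a.end(), b.begin(), b.end());
    return static_cast<std::size_t>(p.first - a.begin());
}
```

A SIMD version has to return exactly this index, which is why "find the first true lane" shows up in every algorithm of this family.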
eve library
- github
- Joel Falcou, Jean-Thierry Lapresté, Alexis Aune, Denis Yaroshevskiy
- eve::algo::mismatch
Vector Processor Extensions
- x86
  - 128 bits: SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2
  - 256 bits: AVX, AVX2, XOP
  - 512 bits: AVX512 and its myriad of sub-genres
- ARM
  - 128 bits: NEON, ASIMD
  - SVE (VLS/VLA)
- RVV
- PowerPC
- WASM
Strlen
- allocations happen in pages
- aligned addresses are safe
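
The arithmetic behind the two bullets above can be checked directly. The sketch below assumes 4 KiB pages and 16-byte (128-bit) registers; both constants and both function names are assumptions for illustration. Since the page size is a multiple of the register size, a register-aligned chunk never straddles a page boundary, so reading it cannot touch a page the string does not already live in.

```cpp
#include <cstdint>

constexpr std::uintptr_t page_size  = 4096;  // assumption: 4 KiB pages
constexpr std::uintptr_t chunk_size = 16;    // assumption: 128-bit registers

// Round an address down to the previous multiple of chunk_size.
constexpr std::uintptr_t align_down(std::uintptr_t addr) {
    return addr & ~(chunk_size - 1);
}

// True when the aligned chunk that contains addr lies entirely
// inside the page that contains addr.
constexpr bool chunk_stays_in_page(std::uintptr_t addr) {
    const std::uintptr_t first = align_down(addr);
    const std::uintptr_t last  = first + chunk_size - 1;
    const std::uintptr_t page  = addr / page_size;
    return first / page_size == page && last / page_size == page;
}
```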
Strlen (new functions)
- load[unsafe](ptr)
- first_true[ignore](test)
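
To make the roles of these two functions concrete, here is a scalar model of the aligned strlen loop. It is a sketch, not EVE code: chunk, load_chunk, first_zero and strlen_chunked are hypothetical stand-ins, with load_chunk playing the part of load[unsafe](ptr) and first_zero the part of first_true[ignore](test). The register width of 16 bytes is also an assumption.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t lanes = 16;           // assumption: 128-bit registers
using chunk = std::array<char, lanes>;      // hypothetical stand-in for a register

// Models load[unsafe](ptr): reads a whole aligned chunk, including bytes
// that are not part of the string.
chunk load_chunk(const char* aligned_ptr) {
    chunk c;
    std::memcpy(c.data(), aligned_ptr, lanes);
    return c;
}

// Models first_true[ignore](test) with test = (byte == 0): index of the first
// zero byte, skipping the first `ignored` lanes. Returns lanes when none match.
std::size_t first_zero(const chunk& c, std::size_t ignored) {
    for (std::size_t i = ignored; i < lanes; ++i)
        if (c[i] == '\0') return i;
    return lanes;
}

// Note: the real technique relies on the hardware page argument. In this
// portable model the caller must make sure the surrounding aligned chunks
// belong to the same buffer as the string.
std::size_t strlen_chunked(const char* s) {
    const std::size_t offset = reinterpret_cast<std::uintptr_t>(s) % lanes;
    const char* p = s - offset;             // aligned address at or before s
    std::size_t ignored = offset;           // lanes before s may hold garbage
    for (;;) {
        const std::size_t i = first_zero(load_chunk(p), ignored);
        if (i != lanes) return static_cast<std::size_t>(p + i - s);
        p += lanes;
        ignored = 0;                        // only the first chunk has ignored lanes
    }
}
```

Only the first load needs the ignore mask. Every later chunk starts inside the string, so any zero byte found there is a real terminator.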
memcmp/memchr/strlen conclusions
- Compilation target is important
- Types are important
- You can't just do "for each"
- Overlapping registers tail handling
- Aligned loads tail handling
- Registers can contain garbage
- Stephen Canon (Stack Overflow)
- glibc
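
The "overlapping registers" tail handling from the list above can be sketched for a memchr-like search. This is a scalar model under assumptions: the 16-byte width and the names find_in_chunk and find_byte are illustrative only. The point is that the last load is moved back so that it ends exactly at the end of the data, instead of reading past it.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

constexpr std::size_t lanes = 16;  // assumption: 128-bit registers

// Index of the first byte equal to value in the chunk starting at p,
// or lanes when the chunk does not contain it.
std::size_t find_in_chunk(const unsigned char* p, unsigned char value) {
    std::array<unsigned char, lanes> c;
    std::memcpy(c.data(), p, lanes);        // models one unaligned full-width load
    for (std::size_t i = 0; i < lanes; ++i)
        if (c[i] == value) return i;
    return lanes;
}

// Index of the first occurrence of value in [data, data + n), or n if absent.
std::size_t find_byte(const unsigned char* data, std::size_t n, unsigned char value) {
    if (n < lanes) {                        // too short to overlap: scalar fallback
        for (std::size_t i = 0; i < n; ++i)
            if (data[i] == value) return i;
        return n;
    }
    std::size_t pos = 0;
    for (; pos + lanes <= n; pos += lanes) {
        const std::size_t i = find_in_chunk(data + pos, value);
        if (i != lanes) return pos + i;
    }
    if (pos == n) return n;                 // length was a multiple of lanes
    // Tail: reload the last full chunk so it ends exactly at data + n.
    // It overlaps bytes that were already checked and held no match.
    const std::size_t last = n - lanes;
    const std::size_t i = find_in_chunk(data + last, value);
    return i != lanes ? last + i : n;
}
```

Overlap is harmless for a search because re-checking bytes does not change the answer. For an algorithm like reduce it would count elements twice, which is one reason you can't just do "for each".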
reduce
- reduce to the same type
- reduce to a different type
reduce same type (operations)
- replace_ignored(x, ignore, with)
- reduce(w, op)
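
A scalar model may help show how these two operations fit together in a sum. It is a sketch under assumptions, not EVE code: wide_int is a std::array standing in for a register of four ints, and replace_ignored_tail and reduce_plus are hypothetical names that mirror replace_ignored(x, ignore, with) and reduce(w, op).

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t lanes = 4;            // assumption: 4 ints per register
using wide_int = std::array<int, lanes>;    // hypothetical stand-in for a register

// Models replace_ignored(x, ignore, with): lanes at index >= keep are
// overwritten with `with`, the neutral element of the operation.
wide_int replace_ignored_tail(wide_int x, std::size_t keep, int with) {
    for (std::size_t i = keep; i < lanes; ++i) x[i] = with;
    return x;
}

// Models reduce(w, op) for op = plus: a horizontal sum of all lanes.
int reduce_plus(const wide_int& w) {
    int r = 0;
    for (int v : w) r += v;
    return r;
}

int sum(const int* data, std::size_t n) {
    wide_int acc{};                         // zero in every lane
    std::size_t pos = 0;
    for (; pos + lanes <= n; pos += lanes)
        for (std::size_t i = 0; i < lanes; ++i) acc[i] += data[pos + i];
    if (pos < n) {
        const std::size_t keep = n - pos;
        wide_int tail;
        for (std::size_t i = 0; i < lanes; ++i)
            tail[i] = i < keep ? data[pos + i] : -12345;  // simulated garbage lanes
        tail = replace_ignored_tail(tail, keep, 0);       // 0 is neutral for plus
        for (std::size_t i = 0; i < lanes; ++i) acc[i] += tail[i];
    }
    return reduce_plus(acc);                // one horizontal reduction at the end
}
```

The accumulator stays a full register for the whole loop. The horizontal reduce, the expensive step, runs once.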
reduce (different type)
- wide<T, fixed<N>>
- convert(w, as)
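
Reducing to a different type matters when the element type would overflow, for example summing bytes. The sketch below is a scalar model under assumptions: the alias wide<T> is a std::array with a fixed lane count standing in for wide<T, fixed<N>>, and convert_to_u32 is a hypothetical name for the role of convert(w, as). Fixing the lane count keeps the narrow input and the wide accumulator lane-for-lane compatible.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t lanes = 4;            // assumption: fixed lane count N = 4
template <typename T>
using wide = std::array<T, lanes>;          // hypothetical stand-in for wide<T, fixed<N>>

// Models convert(w, as): lane-wise conversion to a wider element type,
// keeping the same number of lanes.
wide<std::uint32_t> convert_to_u32(const wide<std::uint8_t>& w) {
    wide<std::uint32_t> r{};
    for (std::size_t i = 0; i < lanes; ++i) r[i] = w[i];
    return r;
}

std::uint32_t sum_bytes(const std::uint8_t* data, std::size_t n) {
    wide<std::uint32_t> acc{};              // accumulate in the wider type
    std::size_t pos = 0;
    for (; pos + lanes <= n; pos += lanes) {
        wide<std::uint8_t> chunk{};
        for (std::size_t i = 0; i < lanes; ++i) chunk[i] = data[pos + i];
        const wide<std::uint32_t> widened = convert_to_u32(chunk);
        for (std::size_t i = 0; i < lanes; ++i) acc[i] += widened[i];
    }
    std::uint32_t total = 0;
    for (std::uint32_t v : acc) total += v; // horizontal reduction
    for (; pos < n; ++pos) total += data[pos];  // scalar tail, kept simple here
    return total;
}
```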