From SIMD Wrappers to SIMD Ranges (pt 1).

Joel Falcou, Denis Yaroshevskiy

This talk

tinyurl.com/jfdy2025pt1
SIMD ranges are coming
We want to share what we know
We need to teach you SIMD first
Discussed with Matthias Kretz

No Magic Compiler

std::unseq is in progress

Let's explain some SIMD

memcmp / memchr / strlen
reduce
structs
min_element
copy_if

eve library

github
Joel Falcou, Jean-Thierry Lapresté, Alexis Aune, Denis Yaroshevskiy
eve::algo::mismatch

Vector Processor Extensions

x86

128 bits: SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2
256 bits: AVX, AVX2, XOP
512 bits: AVX512 and its myriad subsets

ARM

128 bits: NEON, ASIMD
SVE (VLS/VLA)

RVV
PowerPC
WASM

What's inside mismatch?

find_if_not + zip

find

strlen

allocations happen in pages
aligned addresses are safe
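
The two points above can be illustrated with a little address arithmetic. Memory protection is applied per page, so a load that stays inside one page cannot fault as long as at least one byte of it is valid, and a load aligned to its own width never straddles a page. This is only a sketch: the 4 KiB page size and the helper names `stays_in_page` and `align_down` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 4 KiB pages, a common default; real code should query the OS.
constexpr std::uintptr_t page_size = 4096;

// True if reading `width` bytes (width >= 1) starting at `addr`
// touches a single page.
constexpr bool stays_in_page(std::uintptr_t addr, std::size_t width) {
    return (addr / page_size) == ((addr + width - 1) / page_size);
}

// Rounds `addr` down to a multiple of `width` (width must be a power of two).
constexpr std::uintptr_t align_down(std::uintptr_t addr, std::size_t width) {
    return addr & ~(static_cast<std::uintptr_t>(width) - 1);
}
```

A 16-byte load at address 4090 would cross into the next page, while the aligned load at `align_down(4090, 16)` does not. The price is that the aligned load also reads bytes before the start of the string, which have to be ignored.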

memcmp/memchr/strlen conclusions

Compilation target is important
Types are important
You can't just do "for each"
Tail handling: overlapping registers
Tail handling: aligned loads
Registers can contain garbage
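
As a rough illustration of tail handling with overlapping registers, here is a portable sketch of a byte search that works in fixed blocks of 16 bytes. The inner loop stands in for one SIMD load, compare and mask extraction. The names `block`, `first_match_in_block` and `find_byte` are hypothetical and not taken from any library.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t block = 16;  // stands in for one 128-bit register of bytes

// Scans one full block. A SIMD version would do one load, one compare and
// one mask extraction here. Returns `block` when nothing matches.
inline std::size_t first_match_in_block(const std::uint8_t* p, std::uint8_t v) {
    for (std::size_t i = 0; i < block; ++i)
        if (p[i] == v) return i;
    return block;
}

// Returns the index of the first byte equal to `v`, or `n` if there is none.
inline std::size_t find_byte(const std::uint8_t* data, std::size_t n, std::uint8_t v) {
    if (n < block) {  // too short to overlap: plain scalar loop
        for (std::size_t i = 0; i < n; ++i)
            if (data[i] == v) return i;
        return n;
    }
    std::size_t i = 0;
    for (; i + block <= n; i += block) {
        std::size_t m = first_match_in_block(data + i, v);
        if (m != block) return i + m;
    }
    if (i == n) return n;  // length was a multiple of the block size
    // Tail: step back so the last block ends exactly at `n`. It re-reads
    // bytes that were already checked, which is harmless for a search.
    std::size_t last = n - block;
    std::size_t m = first_match_in_block(data + last, v);
    return m != block ? last + m : n;
}
```

Re-reading elements is fine for a search but would give wrong answers for something like a sum, which is one reason a generic "for each" does not carry over directly.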

memcmp/memchr/strlen acknowledgements

Stephen Canon (Stack Overflow)
glibc
std::mismatch - Nikolas Klauser

reduce

reduce to the same type
reduce to a different type
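
The difference between the two cases shows up even in scalar code. A minimal sketch, assuming byte-sized inputs; the function names `sum_same_type` and `sum_widened` are made up for this example.

```cpp
#include <cstdint>
#include <numeric>
#include <vector>

// Same type: the accumulator is uint8_t, so the sum wraps modulo 256.
inline std::uint8_t sum_same_type(const std::vector<std::uint8_t>& v) {
    return std::reduce(v.begin(), v.end(), std::uint8_t{0});
}

// Different type: the accumulator is uint32_t, so partial sums are
// stored in 32 bits and do not wrap for inputs of this size.
inline std::uint32_t sum_widened(const std::vector<std::uint8_t>& v) {
    return std::reduce(v.begin(), v.end(), std::uint32_t{0});
}
```

In SIMD terms the first version keeps sixteen 8-bit lanes per 128-bit register, while the second needs four 32-bit lanes per register, so every input register has to be converted into several wider ones before it can be added.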

reduce conclusions

shuffles
different interfaces
mixing types

Structs

min_element

min_element(1)

std::reduce(rng, min)
reduce + find
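
One way to read the "reduce + find" idea is as two passes: reduce the range to its minimum value, then search for the first position holding that value. The sketch below is a scalar illustration under that assumption; `min_index_reduce_find` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the index of the first smallest element, or v.size() if v is empty.
inline std::size_t min_index_reduce_find(const std::vector<int>& v) {
    if (v.empty()) return v.size();
    // Pass 1: reduce to the minimum value.
    int lowest = std::reduce(v.begin() + 1, v.end(), v.front(),
                             [](int a, int b) { return std::min(a, b); });
    // Pass 2: find where that value first occurs.
    return static_cast<std::size_t>(
        std::find(v.begin(), v.end(), lowest) - v.begin());
}
```

Both passes are operations that vectorize well on their own, at the cost of reading the data twice. A single-pass version has to carry indices alongside the values.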

min_element conclusions

index handling
writing a loop is difficult
"Not for each"

min_element acknowledgements

copy_if

copy_if conclusions

compress_*
different interfaces
"Not for each" (two loops inside)
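
A compress operation packs the selected lanes of a register to the front, so the whole register can be stored and the output position advanced by the number of lanes kept. The scalar sketch below shows the same store-then-advance pattern one element at a time. It is an illustration only, not necessarily the technique the talk presents, and `copy_if_branchless` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Copies the elements of `in` that satisfy `pred` into `out` and returns how
// many were kept. Precondition: `out` has room for in.size() elements,
// because every element is stored before we know whether it is kept.
template <typename Pred>
std::size_t copy_if_branchless(const std::vector<int>& in, int* out, Pred pred) {
    std::size_t kept = 0;
    for (int x : in) {
        out[kept] = x;                               // unconditional store
        kept += static_cast<std::size_t>(pred(x));   // advance only when kept
    }
    return kept;
}
```

Because stores can run past the last kept element, the output buffer must be at least as large as the input, and that constraint is part of why the interface differs from `std::copy_if`.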

copy_if acknowledgements

tiny lookup tables (@aqrit)
bmi2 (Peter Cordes)
switch + shuffle (@Z boson)
simd-scalar: Peter Cordes, Ilya Albrecht

Links