From SIMD Wrappers to SIMD Ranges (pt 1).

Joel Falcou, Denis Yaroshevskiy

This talk

  • tinyurl.com/jfdy2025pt1
  • SIMD ranges are coming
  • We want to share what we have learned
  • We need to teach you SIMD first
  • Discussed with Matthias Kretz

No Magic Compiler

  • std::execution::unseq is in progress

Let's explain some SIMD

  • memcmp / memchr / strlen
  • reduce
  • structs
  • min_element
  • copy_if

eve library

  • GitHub
  • Joel Falcou, Jean-Thierry Lapresté, Alexis Aune, Denis Yaroshevskiy
  • eve::algo::mismatch

eve::algo::mismatch

Vector Processor Extensions

  • x86
    • 128 bits: SSE2, SSE3, SSSE3, SSE4, SSE4.1, SSE4.2
    • 256 bits: AVX, AVX2, XOP
    • 512 bits: AVX-512 and its many sub-extensions
  • ARM
    • 128 bits: NEON, ASIMD
    • SVE (VLS/VLA)
  • RVV
  • PowerPC
  • WASM

What's inside mismatch?

  • find_if_not + zip
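
To make the decomposition concrete, here is a minimal scalar sketch of mismatch written as find_if_not over a zipped pair of sequences. The names zip_find_if_not and mismatch_sketch are hypothetical and are not eve's API; eve does the same thing a register at a time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: walks both sequences in lockstep ("zip") and stops at
// the first position where the predicate does NOT hold ("find_if_not").
template <typename T, typename Pred>
std::size_t zip_find_if_not(const std::vector<T>& a, const std::vector<T>& b, Pred p) {
    assert(b.size() >= a.size());  // b must cover every position of a
    std::size_t i = 0;
    while (i < a.size() && p(a[i], b[i])) ++i;
    return i;
}

// mismatch is then find_if_not with "both halves are equal" as the predicate.
// Returns the first differing index, or a.size() if there is none.
template <typename T>
std::size_t mismatch_sketch(const std::vector<T>& a, const std::vector<T>& b) {
    return zip_find_if_not(a, b, [](const T& x, const T& y) { return x == y; });
}
```

Expressing mismatch this way means only find needs a hand-tuned SIMD loop; zip just changes what one "element" is.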

find

strlen

  • allocations happen in pages
  • aligned addresses are safe
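
The two bullets above can be sketched as follows, assuming an x86 target with SSE2. A 16-byte aligned load never crosses a page boundary, so it cannot fault as long as at least one byte of the block belongs to the string. The function name strlen_aligned_sketch is hypothetical, and this is not the code from the talk. Reading bytes outside the string is formally outside what the C++ standard allows and will be flagged by sanitizers; it relies on how the hardware and OS hand out memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics (assumes an x86 target)

// Index of the lowest set bit; the caller guarantees mask != 0.
inline unsigned lowest_set_bit(unsigned mask) {
    unsigned i = 0;
    while ((mask & 1u) == 0) { mask >>= 1; ++i; }
    return i;
}

// Hypothetical name; a sketch under the assumptions stated in the lead-in.
std::size_t strlen_aligned_sketch(const char* s) {
    assert(s != nullptr);
    const std::size_t offset = reinterpret_cast<std::uintptr_t>(s) & 15u;
    const char* block = s - offset;  // round down to a 16-byte boundary
    const __m128i zeros = _mm_setzero_si128();

    // One aligned load, compare every byte with 0, pack results into 16 bits.
    auto zero_mask = [&](const char* p) {
        const __m128i chunk = _mm_load_si128(reinterpret_cast<const __m128i*>(p));
        return static_cast<unsigned>(_mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zeros)));
    };

    // First block: bytes before s are garbage, so clear their mask bits.
    unsigned mask = zero_mask(block) & (~0u << offset);
    while (mask == 0) {
        block += 16;  // every following block is still aligned
        mask = zero_mask(block);
    }
    return static_cast<std::size_t>((block - s) +
                                    static_cast<std::ptrdiff_t>(lowest_set_bit(mask)));
}
```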

memcmp/memchr/strlen conclusions

  • Compilation target is important
  • Types are important
  • You can't just do "for each"
  • Tail handling with overlapping registers
  • Tail handling with aligned loads
  • Registers can contain garbage
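
As an illustration of the overlapping-register tail, here is a portable sketch of a memchr-like search. The inner function first_match_in_block stands in for a single SIMD compare plus movemask; kWidth and all function names are hypothetical. The last block is placed so that it ends exactly at the end of the data and may re-check bytes that were already scanned, which is harmless for a search.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kWidth = 16;  // stands in for the register width in bytes

// Stand-in for one SIMD compare: index of the first match in [p, p + kWidth),
// or kWidth if there is none.
inline std::size_t first_match_in_block(const std::uint8_t* p, std::uint8_t v) {
    for (std::size_t i = 0; i < kWidth; ++i)
        if (p[i] == v) return i;
    return kWidth;
}

// Returns the index of the first v in [data, data + n), or n if absent.
std::size_t find_overlapping_tail(const std::uint8_t* data, std::size_t n, std::uint8_t v) {
    assert(data != nullptr || n == 0);
    if (n < kWidth) {  // too short for even one full block: scalar fallback
        for (std::size_t i = 0; i < n; ++i)
            if (data[i] == v) return i;
        return n;
    }
    std::size_t i = 0;
    for (; i + kWidth <= n; i += kWidth) {  // main loop over full blocks
        const std::size_t m = first_match_in_block(data + i, v);
        if (m != kWidth) return i + m;
    }
    if (i != n) {
        // Tail: one more full block ending at data + n. The overlapped bytes
        // were already found to contain no match, so any hit is in the new part.
        const std::size_t start = n - kWidth;
        const std::size_t m = first_match_in_block(data + start, v);
        if (m != kWidth) return start + m;
    }
    return n;
}
```

The overlap trick only works for operations that can safely see an element twice; a sum, for example, would need a masked tail instead.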

memcmp/memchr/strlen acknowledgements

reduce

  • reduce to the same type
  • reduce to a different type
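
A small portable sketch of the difference between the two cases, using a fixed array of accumulators as a stand-in for SIMD lanes (kLanes and lane_sum are hypothetical names). Summing bytes into byte-sized lanes wraps around, while widening each element into a larger accumulator type gives the expected total. The final loop over the lanes is the horizontal step that real SIMD code performs with shuffles.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kLanes = 4;  // stands in for the lanes of one register

// Sums bytes using kLanes independent accumulators of type Acc.
// Acc = std::uint8_t  : "reduce to the same type" (wraps modulo 256).
// Acc = std::uint32_t : "reduce to a different type" (widening).
template <typename Acc>
Acc lane_sum(const std::vector<std::uint8_t>& v) {
    std::array<Acc, kLanes> acc{};
    std::size_t i = 0;
    for (; i + kLanes <= v.size(); i += kLanes)
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            acc[lane] = static_cast<Acc>(acc[lane] + v[i + lane]);

    Acc total = 0;
    for (Acc a : acc)  // horizontal step: combine the lanes
        total = static_cast<Acc>(total + a);
    for (; i < v.size(); ++i)  // scalar tail
        total = static_cast<Acc>(total + v[i]);
    return total;
}
```

With 300 elements equal to 1, the byte-sized version returns 44 (300 modulo 256) and the widened version returns 300.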

reduce conclusions

  • shuffles
  • different interfaces
  • mixing types

Structs

min_element

min_element(1)

  • std::reduce(rng, min)
  • reduce + find
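
The "reduce + find" formulation can be sketched in scalar C++ as two passes: first reduce with min to obtain the smallest value, then find its first position. The name min_element_two_pass is hypothetical, and std::accumulate is used here as a sequential stand-in for std::reduce so that the sketch also builds before C++17.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <numeric>
#include <vector>

// Hypothetical name; a sketch of the two-pass idea, not eve's implementation.
template <typename It>
It min_element_two_pass(It first, It last) {
    if (first == last) return last;
    using T = typename std::iterator_traits<It>::value_type;

    // Pass 1: reduce with min to get the smallest value.
    const T smallest = std::accumulate(
        std::next(first), last, *first,
        [](const T& a, const T& b) { return b < a ? b : a; });

    // Pass 2: find the first position holding that value.
    return std::find(first, last, smallest);
}
```

Both passes vectorize well on their own, but the data is read twice, and a single-pass version has to track indices alongside values.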

min_element conclusions

  • index handling
  • writing a loop is difficult
  • "Not for each"

min_element acknowledgements

copy_if

copy_if conclusions

  • compress_*
  • different interfaces
  • "Not for each" (two loops inside)
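
A portable sketch of the compress idea with a tiny lookup table, using 4 lanes so the table has only 16 entries. All names (kLanes, Shuffle, make_table, copy_if_compress) are hypothetical; in real SIMD code the table entry would feed a shuffle instruction, or the hardware would provide a compress store directly. Note the two loops: a main loop that always stores a full block, and a scalar tail.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kLanes = 4;  // stands in for the lanes of one register

struct Shuffle {
    std::array<std::uint8_t, kLanes> idx;  // selected lanes, packed to the front
    std::uint8_t count;                    // how many lanes were selected
};

// Builds the 16-entry table: one shuffle pattern per 4-bit mask.
inline std::array<Shuffle, 16> make_table() {
    std::array<Shuffle, 16> t{};
    for (unsigned m = 0; m < 16; ++m) {
        std::uint8_t n = 0;
        for (std::uint8_t lane = 0; lane < kLanes; ++lane)
            if (m & (1u << lane)) t[m].idx[n++] = lane;
        t[m].count = n;
    }
    return t;
}

template <typename Pred>
std::vector<int> copy_if_compress(const std::vector<int>& in, Pred keep) {
    static const std::array<Shuffle, 16> table = make_table();
    std::vector<int> out(in.size() + kLanes);  // slack for full-block stores
    std::size_t o = 0, i = 0;
    for (; i + kLanes <= in.size(); i += kLanes) {  // loop 1: full blocks
        unsigned mask = 0;
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            if (keep(in[i + lane])) mask |= 1u << lane;
        const Shuffle& s = table[mask];
        assert(o + kLanes <= out.size());
        // Store all kLanes; only the first s.count are meaningful, the rest
        // are garbage that the next store overwrites.
        for (std::size_t lane = 0; lane < kLanes; ++lane)
            out[o + lane] = in[i + s.idx[lane]];
        o += s.count;
    }
    for (; i < in.size(); ++i)  // loop 2: scalar tail
        if (keep(in[i])) out[o++] = in[i];
    out.resize(o);
    return out;
}
```

Storing a full block and advancing by the number of kept elements avoids a branch per element; the price is that the output needs room for one extra block.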

copy_if acknowledgements

  • tiny lookup tables (@aqrit)
  • bmi2 (Peter Cordes)
  • switch + shuffle (@Z boson)
  • simd-scalar: Peter Cordes, Ilya Albrecht

Links