From SIMD Wrappers to SIMD Ranges (pt 1).

Joel Falcou, Denis Yaroshevskiy

This talk

  • TODO: link to pt1 slides
  • SIMD ranges are coming
  • We want to share what we have learned
  • We need to teach you SIMD first

Let's do some algorithms

  • memcmp
  • memchr
  • strlen
  • reduce
  • inclusive_scan
  • min_element
  • copy_if
  • reverse

Notes

  • std::mismatch - Nikolas Klauser

eve library

  • github
  • Joel Falcou, Jean-Thierry Lapresté, Alexis Aune, Denis Yaroshevskiy
  • eve::algo::mismatch

Vector Processor Extensions

  • x86
    • 128 bits: SSE2, SSE3, SSSE3, SSE4.1, SSE4.2
    • 256 bits: AVX, AVX2, XOP
    • 512 bits: AVX512 and its myriad sub-genres
  • ARM
    • 128 bits: NEON, ASIMD
    • SVE (VLS/VLA)
  • RISC-V
    • RVV
  • PowerPC
  • WASM

What's inside mismatch?

  • find_if_not + zip
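
The decomposition above can be sketched in plain scalar C++. This is only an illustration of the idea, not the eve implementation: `find_if_not_index` and `mismatch_index` are hypothetical names, and a shared index stands in for the zip of the two ranges.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first i in [0, n) for which pred(i) is
// false, or n if pred holds everywhere. This is find_if_not over indices.
template <typename Pred>
std::size_t find_if_not_index(std::size_t n, Pred pred) {
    for (std::size_t i = 0; i != n; ++i)
        if (!pred(i)) return i;
    return n;
}

// mismatch = find_if_not(zip(a, b), "both elements are equal").
// The shared index i plays the role of the zipped pair (a[i], b[i]).
template <typename T>
std::size_t mismatch_index(const std::vector<T>& a, const std::vector<T>& b) {
    const std::size_t n = std::min(a.size(), b.size());
    return find_if_not_index(n, [&](std::size_t i) { return a[i] == b[i]; });
}
```

Once mismatch is phrased this way, only the find needs a SIMD implementation; the zip just feeds it two registers per step instead of one.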

find

strlen

  • allocations happen in pages
  • aligned addresses are safe
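
The arithmetic behind these two bullets can be checked with a small sketch. It assumes 4096-byte pages and 16-byte registers (both are assumptions of this example, not statements about any particular target), and the helper names are hypothetical.

```cpp
#include <cstdint>

// Assumptions for this sketch: 4096-byte pages, 16-byte registers.
constexpr std::uintptr_t page_size = 4096;
constexpr std::uintptr_t reg_size  = 16;

// Round an address down to a multiple of `alignment` (must be a power of two).
constexpr std::uintptr_t align_down(std::uintptr_t addr, std::uintptr_t alignment) {
    return addr & ~(alignment - 1);
}

// True if a reg_size-byte load starting at addr touches only one page.
constexpr bool load_stays_in_page(std::uintptr_t addr) {
    return addr / page_size == (addr + reg_size - 1) / page_size;
}

// An unaligned load that starts on the last byte of a page spills into the
// next page, which may not be mapped.
static_assert(!load_stays_in_page(page_size - 1));

// The aligned load covering that same byte stays inside the page.
static_assert(load_stays_in_page(align_down(page_size - 1, reg_size)));
```

Because the page size is a multiple of the register size, an aligned load that contains at least one valid byte can never reach into a neighbouring page.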

strlen (new functions)

  • load[unsafe](ptr)
  • first_true[ignore](test)
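
A scalar model of how these two operations might combine into strlen is sketched below. It is not eve code: `load_aligned` and `first_zero` are hypothetical stand-ins for `load[unsafe]` and `first_true[ignore]`, a `std::array` stands in for a register, and the caller is assumed to own every byte of each 16-byte aligned block the string touches, so that the model stays well-defined C++.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>

constexpr std::size_t N = 16;           // lanes per modelled register
using chunk = std::array<char, N>;

// Stand-in for an aligned register load of N bytes starting at p.
chunk load_aligned(const char* p) {
    chunk c;
    std::memcpy(c.data(), p, N);
    return c;
}

// Stand-in for first_true[ignore_first(k)](c == 0): position of the first
// zero lane, skipping the first `ignore_first` lanes.
std::optional<std::size_t> first_zero(const chunk& c, std::size_t ignore_first) {
    for (std::size_t i = ignore_first; i < N; ++i)
        if (c[i] == 0) return i;
    return std::nullopt;
}

// Assumption: s points into storage that covers every N-byte aligned block
// from the one containing s up to the one containing the terminator.
std::size_t strlen_aligned(const char* s) {
    const std::size_t offset = reinterpret_cast<std::uintptr_t>(s) % N;
    const char* block = s - offset;     // previous aligned address
    std::size_t ignore = offset;        // lanes before s hold unrelated bytes
    for (;;) {
        if (auto pos = first_zero(load_aligned(block), ignore))
            return static_cast<std::size_t>(block + *pos - s);
        block += N;
        ignore = 0;                     // only the first block is partial
    }
}
```

The first load starts before the string, so its leading lanes contain unrelated bytes; ignoring them is what keeps a stray zero in front of the string from ending the search early.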

memcmp/memchr/strlen conclusions

  • Compilation target is important
  • Types are important
  • You can't just do "for each"
  • Tail handling with overlapping registers
  • Tail handling with aligned loads
  • Registers can contain garbage
  • Stephen Canon (Stack Overflow)
  • glibc
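
The overlapping-registers approach to the tail can be sketched with a scalar model of a memchr-like search. The names `first_match` and `find_byte` are hypothetical, a `std::array` stands in for one register, and the sketch assumes the input holds at least one full register; shorter inputs need a different strategy.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <optional>

constexpr std::size_t N = 16;   // lanes per modelled register

// Position of the first lane equal to `value` in the N bytes at p, if any.
std::optional<std::size_t> first_match(const unsigned char* p, unsigned char value) {
    std::array<unsigned char, N> reg;
    std::memcpy(reg.data(), p, N);          // models one unaligned load
    for (std::size_t i = 0; i < N; ++i)
        if (reg[i] == value) return i;
    return std::nullopt;
}

// Returns the index of the first `value` in data[0, n), or n if absent.
std::size_t find_byte(const unsigned char* data, std::size_t n, unsigned char value) {
    assert(n >= N);                         // precondition of this sketch
    std::size_t i = 0;
    for (; i + N <= n; i += N)
        if (auto pos = first_match(data + i, value)) return i + *pos;
    if (i != n) {
        // Tail: step back so the last load ends exactly at data + n.
        // The overlapped lanes were already searched and held no match,
        // so any hit found here lies in the not-yet-seen part.
        i = n - N;
        if (auto pos = first_match(data + i, value)) return i + *pos;
    }
    return n;
}
```

Re-checking a few lanes is harmless for a search, but the same trick would count elements twice in a reduction, which is one reason a plain "for each" is not enough.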

reduce

  • reduce to the same type
  • reduce to a different type

reduce (same type): operations

  • replace_ignored(x, ignore, with)
  • reduce(w, op)
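
How these two operations might fit together in a sum is sketched below as a scalar model. It is not the eve API: `load_or` models a partial load followed by `replace_ignored` with the identity element, and `reduce_lanes` models the horizontal `reduce(w, op)`; both names and the 8-lane width are assumptions of this example.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t N = 8;            // lanes per modelled register
using wide_int = std::array<int, N>;

// Models "load the first `count` lanes, then replace the ignored lanes
// with `with`". For a sum, `with` is the identity element 0.
wide_int load_or(const int* p, std::size_t count, int with) {
    wide_int w;
    for (std::size_t i = 0; i < N; ++i) w[i] = i < count ? p[i] : with;
    return w;
}

// Lane-wise addition of two registers.
wide_int add(wide_int a, const wide_int& b) {
    for (std::size_t i = 0; i < N; ++i) a[i] += b[i];
    return a;
}

// Models the horizontal reduce(w, op): folds all lanes into one value.
template <typename Op>
int reduce_lanes(const wide_int& w, Op op) {
    int r = w[0];
    for (std::size_t i = 1; i < N; ++i) r = op(r, w[i]);
    return r;
}

int sum(const int* data, std::size_t n) {
    wide_int acc{};                     // all lanes start at 0
    for (std::size_t i = 0; i < n; i += N) {
        const std::size_t count = n - i < N ? n - i : N;
        acc = add(acc, load_or(data + i, count, 0));
    }
    return reduce_lanes(acc, [](int a, int b) { return a + b; });
}
```

The main loop stays lane-wise and the horizontal reduction happens once at the end; filling ignored lanes with the identity lets the tail go through the same code path as full registers.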

reduce (different type)

  • wide<T, fixed<N>>
  • convert(w, as)
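
The need for a fixed lane count and a conversion shows up when summing bytes into a wider accumulator. The sketch below is a scalar model with hypothetical definitions: the `wide` alias and the `convert` function only imitate the shape of the operations named above, and the tail is handled with a plain scalar loop to keep the example short.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Model of wide<T, fixed<N>>: exactly N lanes of T, whatever the element size.
template <typename T, std::size_t N>
using wide = std::array<T, N>;

// Model of convert(w, as<U>): lane-wise conversion that keeps the lane count.
template <typename U, typename T, std::size_t N>
wide<U, N> convert(const wide<T, N>& w) {
    wide<U, N> r{};
    for (std::size_t i = 0; i < N; ++i) r[i] = static_cast<U>(w[i]);
    return r;
}

// Sums bytes into a 32-bit result. Accumulating in uint8_t lanes would wrap
// around after a handful of elements, so each register is widened first.
std::uint32_t sum_bytes(const std::uint8_t* data, std::size_t n) {
    constexpr std::size_t N = 16;
    wide<std::uint32_t, N> acc{};       // same lane count as the byte register
    std::size_t i = 0;
    for (; i + N <= n; i += N) {
        wide<std::uint8_t, N> bytes;
        for (std::size_t j = 0; j < N; ++j) bytes[j] = data[i + j];
        const auto widened = convert<std::uint32_t>(bytes);
        for (std::size_t j = 0; j < N; ++j) acc[j] += widened[j];
    }
    std::uint32_t total = 0;
    for (auto lane : acc) total += lane;    // horizontal reduction
    for (; i < n; ++i) total += data[i];    // scalar tail
    return total;
}
```

Fixing the lane count matters because 16 bytes fill one 128-bit register while 16 32-bit values need four of them; the accumulator has to match the input lane for lane.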