std::inclusive_scan comparison

In a good case, Intel is better at 40 bytes and AMD at 10'000 bytes.
Otherwise, nothing very interesting.
AMD is more sensitive to code alignment, but the variation stays within 20%,
which is roughly OK for code alignment effects (set padding to 'minmax' to see it).

std::inclusive_scan (40 bytes)

std::inclusive_scan (1000 bytes)

std::inclusive_scan (10'000 bytes)