sslotin's blog

By sslotin, 3 years ago, In English

Hi everyone! I'm writing a book on performance engineering, and a few days ago, I finished a draft of one of its main crown jewels: the SIMD programming chapter.

The main findings that are published:

  • You can compute array sums and other reductions such as the minimum 2x faster than std::accumulate or an auto-vectorized loop would.
  • You can sometimes copy and assign memory 2x faster than memcpy and memset respectively.
  • You can search array elements 10x times faster than std::find or manually.
  • You can count a value in an array 1.5x faster than std::count or manually.
  • You can calculate population counts of large vectors ~2x faster than with the intrinsic.
  • You can filter arrays 6-7x faster, which translates to e. g. a 6-7x faster quicksort (to be published).
  • You can calculate the index of the minimum element ~10x faster than std::min_element or manually.
  • UPD: you can calculate the prefix sum of an array ~2.5x faster than std::partial_sum or manually.

All speedup numbers are architecture-specific and may be different (usually larger) on CodeForces servers.

Enjoy — and as always, I'm happy to hear any comments and suggestions.

  • Vote: I like it
  • +503
  • Vote: I do not like it

| Write comment?
»
3 years ago, # |
  Vote: I like it +28 Vote: I do not like it

Super cool.

»
3 years ago, # |
  Vote: I like it +23 Vote: I do not like it

fascinating. If you have a library for optimized common algorithms, can you share it?

  • »
    »
    3 years ago, # ^ |
    Rev. 2   Vote: I like it +25 Vote: I do not like it

    I don't have a library, but I have a (somewhat disorganized) repo with complete implementations of all the algorithms described in the book. Eventually turning it into a proper C/C++ library is definitely worthwhile, and this way it will also be more likely to be merged into the mainstream STL implementations.

»
2 years ago, # |
  Vote: I like it +44 Vote: I do not like it

An update: I will be speaking at a small online conference called Performance Summit tomorrow. Still figuring out the best way to create a YouTube live stream of my session, but once I do, the link will be in the "streams" section (you can also join via Zoom to ask quesitons).

The talk is called "The Art of SIMD Programming", and it covers pretty much everything in this chapter and a bit more. Come by if you prefer talks over books.

  • »
    »
    2 years ago, # ^ |
    Rev. 2   Vote: I like it +22 Vote: I do not like it

    Just noticed that the stream was removed from the streams section, and I can't seem to find any zoom link either. Is there a way to watch the stream somehow?

    • »
      »
      »
      2 years ago, # ^ |
      Rev. 2   Vote: I like it +17 Vote: I do not like it

      Yeah, sorry, I screwed up with the YT stream — it turns out you can't properly run Zoom webinars and OBS at the same time on Linux. But the organizers promised to give me the recording right after the conference finishes, so I will post it here in a few hours.

      • »
        »
        »
        »
        2 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        Waiting :) !

        • »
          »
          »
          »
          »
          2 years ago, # ^ |
            Vote: I like it +8 Vote: I do not like it

          If you're still waiting, you can find the stream here. You'll probably need to make an account first.