Data-Parallel Types (SIMD)

The data-parallel types (SIMD) library provides portable types for explicit vectorization, along with element-wise operations on them. Today, I want to take a closer look at it.

Before diving into the new library, I would like to take a moment to discuss SIMD.

SIMD

Vectorization refers to the SIMD (Single Instruction, Multiple Data) extensions of a modern processor’s instruction set. SIMD enables your processor to execute one operation in parallel on several data.

A Simple Example

Whether an algorithm runs in a vectorized way depends on many factors: whether the CPU and the operating system support SIMD instructions, and which compiler and optimization level you used to compile your code.

// SIMD.cpp

const int SIZE = 8;

int vec[] = {1, 2, 3, 4, 5, 6, 7, 8};
int res[SIZE] = {0};

int main() {
  for (int i = 0; i < SIZE; ++i) {
    res[i] = vec[i] + 5;  // (1)
  }
}


The line marked (1) is the key line in the small program. Thanks to the Compiler Explorer, it is quite easy to generate the assembler instructions for Clang 3.6 with and without maximum optimization (-O3).

Without Optimization

Although my time fiddling with assembler instructions is long gone, it is evident that everything is done sequentially: each addition is performed one element at a time.

With Maximum Optimization

With maximum optimization, I get instructions that operate on several data elements in parallel.

The move operation (movdqa) and the add operation (paddd) use the special registers xmm0 and xmm1. Both registers are so-called SSE registers and are 128 bits wide, which allows the processor to handle four 32-bit ints in one go. SSE stands for Streaming SIMD Extensions.

Unfortunately, vector instructions are highly architecture-dependent: neither the instructions nor the register widths are uniform. Modern Intel architectures mostly support AVX2 or even AVX-512, enabling 256-bit or 512-bit operations, so 8 or 16 ints can be processed in parallel. AVX stands for Advanced Vector Extensions.

This is exactly where the new data-parallel types library comes in: it offers a unified interface to vector instructions.

Data-parallel types (SIMD)

Before diving into the new library, a few definitions are necessary. They are taken from proposal P1928R15. In total, the new library is based on six proposals.

 



    The set of vectorizable types comprises all standard integer types, character types, and the types float and double. In addition, std::float16_t, std::float32_t, and std::float64_t are vectorizable types if defined.

    The term data-parallel type refers to all enabled specializations of the basic_simd and basic_simd_mask class templates. A data-parallel object is an object of data-parallel type.

    A data-parallel type consists of one or more elements of an underlying vectorizable type, called the element type. The number of elements is a constant for each data-parallel type and called the width of that type. The elements in a data-parallel type are indexed from 0 to width − 1.

    An element-wise operation applies a specified operation to the elements of one or more data-parallel objects. Each such application is unsequenced with respect to the others. A unary element-wise operation is an element-wise operation that applies a unary operation to each element of a data-parallel object. A binary element-wise operation is an element-wise operation that applies a binary operation to corresponding elements of two data-parallel objects.

    After so much theory, let me show a small example. It is from Matthias Kretz, the author of proposal P1928R15, and comes from his presentation at CppCon 2023. The function f takes a vector and maps its elements to their sine values.

    #include <simd>      // C++26 data-parallel types
    #include <vector>

    void f(std::vector<float>& data) {
        using floatv = std::simd<float>;
        for (auto it = data.begin(); it < data.end(); it += floatv::size()) {
            floatv v(it);        // load one block of floatv::size() floats
            v = std::sin(v);     // element-wise sine
            v.copy_to(it);       // store the block back
        }
    }
    


    The function f takes a vector of floats (data) by reference. It defines floatv as a SIMD vector of floats using std::simd<float>. f iterates through the vector in blocks, each block having the size of the SIMD vector; the loop assumes that the vector's size is a multiple of that width.

    For each block, f

    • loads the block into a SIMD vector (floatv v(it);),
    • applies the sine function to all elements of the SIMD vector simultaneously (v = std::sin(v);), and
    • writes the results back to the original vector (v.copy_to(it);).

    The handling of SIMD instructions will become particularly elegant if proposal P0350R4 also makes it into the standard. SIMD could then be used as a new execution policy in algorithms:

    #include <algorithm>
    #include <execution>
    #include <vector>

    void f(std::vector<float>& data) {
        std::for_each(std::execution::simd, data.begin(), data.end(),
                      [](auto& v) {  // v is a SIMD chunk, not a single float
            v = std::sin(v);
        });
    }
    

    What’s next?

    In my next article, I’ll dive deeper into the new SIMD library.

    Rainer Grimm