{"id":10831,"date":"2025-06-25T16:01:32","date_gmt":"2025-06-25T16:01:32","guid":{"rendered":"https:\/\/www.modernescpp.com\/?p=10831"},"modified":"2025-07-03T14:45:54","modified_gmt":"2025-07-03T14:45:54","slug":"data-parallel-types-a-first-example","status":"publish","type":"post","link":"https:\/\/www.modernescpp.com\/index.php\/data-parallel-types-a-first-example\/","title":{"rendered":"Data-Parallel Types &#8211; A First Example"},"content":{"rendered":"\n<p>After providing a theoretical introduction to the new C++ 26 feature in my last article, \u201cData-Parallel Types (SIMD),\u201d I would like to follow up today with a practical example.<\/p>\n\n\n\n<figure class=\"wp-block-image size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"755\" height=\"509\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/Time26ConcurrencySimd-1.png\" alt=\"\" class=\"wp-image-10788\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/Time26ConcurrencySimd-1.png 755w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/Time26ConcurrencySimd-1-300x202.png 300w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/Time26ConcurrencySimd-1-705x475.png 705w\" sizes=\"auto, (max-width: 755px) 100vw, 755px\" \/><\/figure>\n\n\n\n<p>The following introductory example is from the experimental implementation of the <a href=\"https:\/\/en.cppreference.com\/w\/cpp\/experimental\/simd.html\">SIMD library<\/a>. This functionality has been fully adopted in the C++ 26 draft under the name <a href=\"https:\/\/en.cppreference.com\/w\/cpp\/numeric\/simd.html\">Data-parallel types (SIMD)<\/a>. 
To port the program to the C++ 26 standard, it should be sufficient to replace the header<code> &lt;experimental\/simd&gt; <\/code>with <code>&lt;simd&gt;<\/code> and the namespace<code> std::experimental <\/code>with  <code>std::datapar.<\/code><\/p>\n\n\n\n<!-- HTML generated using hilite.me --><div style=\"background: #f0f3f3; overflow:auto;width:auto;gray;border-width:.1em .1em .1em .8em\"><pre style=\"margin: 0; line-height: 125%\"><span style=\"color: #009999\">#include &lt;experimental\/simd&gt;<\/span>\n<span style=\"color: #009999\">#include &lt;iostream&gt;<\/span>\n<span style=\"color: #009999\">#include &lt;string_view&gt;<\/span>\n<span style=\"color: #006699; font-weight: bold\">namespace<\/span> stdx <span style=\"color: #555555\">=<\/span> std<span style=\"color: #555555\">::<\/span>experimental;\n \n<span style=\"color: #007788; font-weight: bold\">void<\/span> <span style=\"color: #CC00FF\">println<\/span>(std<span style=\"color: #555555\">::<\/span>string_view name, <span style=\"color: #006699; font-weight: bold\">auto<\/span> <span style=\"color: #006699; font-weight: bold\">const<\/span><span style=\"color: #555555\">&amp;<\/span> a)\n{\n    std<span style=\"color: #555555\">::<\/span>cout <span style=\"color: #555555\">&lt;&lt;<\/span> name <span style=\"color: #555555\">&lt;&lt;<\/span> <span style=\"color: #CC3300\">&quot;: &quot;<\/span>;\n    <span style=\"color: #006699; font-weight: bold\">for<\/span> (std<span style=\"color: #555555\">::<\/span><span style=\"color: #007788; font-weight: bold\">size_t<\/span> i{}; i <span style=\"color: #555555\">!=<\/span> std<span style=\"color: #555555\">::<\/span>size(a); <span style=\"color: #555555\">++<\/span>i)\n        std<span style=\"color: #555555\">::<\/span>cout <span style=\"color: #555555\">&lt;&lt;<\/span> a[i] <span style=\"color: #555555\">&lt;&lt;<\/span> <span style=\"color: #CC3300\">&#39; &#39;<\/span>;\n    std<span style=\"color: #555555\">::<\/span>cout <span style=\"color: 
#555555\">&lt;&lt;<\/span> <span style=\"color: #CC3300\">&#39;\\n&#39;<\/span>;\n}\n \n<span style=\"color: #006699; font-weight: bold\">template<\/span><span style=\"color: #555555\">&lt;<\/span><span style=\"color: #006699; font-weight: bold\">class<\/span> <span style=\"color: #00AA88; font-weight: bold\">A<\/span><span style=\"color: #555555\">&gt;<\/span>\nstdx<span style=\"color: #555555\">::<\/span>simd<span style=\"color: #555555\">&lt;<\/span><span style=\"color: #007788; font-weight: bold\">int<\/span>, A<span style=\"color: #555555\">&gt;<\/span> my_abs(stdx<span style=\"color: #555555\">::<\/span>simd<span style=\"color: #555555\">&lt;<\/span><span style=\"color: #007788; font-weight: bold\">int<\/span>, A<span style=\"color: #555555\">&gt;<\/span> x)\n{\n    where(x <span style=\"color: #555555\">&lt;<\/span> <span style=\"color: #FF6600\">0<\/span>, x) <span style=\"color: #555555\">=<\/span> <span style=\"color: #555555\">-<\/span>x;\n    <span style=\"color: #006699; font-weight: bold\">return<\/span> x;\n}\n \n<span style=\"color: #007788; font-weight: bold\">int<\/span> main()\n{\n    <span style=\"color: #006699; font-weight: bold\">const<\/span> stdx<span style=\"color: #555555\">::<\/span>native_simd<span style=\"color: #555555\">&lt;<\/span><span style=\"color: #007788; font-weight: bold\">int<\/span><span style=\"color: #555555\">&gt;<\/span> a <span style=\"color: #555555\">=<\/span> <span style=\"color: #FF6600\">1<\/span>;\n    println(<span style=\"color: #CC3300\">&quot;a&quot;<\/span>, a);\n \n    <span style=\"color: #006699; font-weight: bold\">const<\/span> stdx<span style=\"color: #555555\">::<\/span>native_simd<span style=\"color: #555555\">&lt;<\/span><span style=\"color: #007788; font-weight: bold\">int<\/span><span style=\"color: #555555\">&gt;<\/span> b([](<span style=\"color: #007788; font-weight: bold\">int<\/span> i) { <span style=\"color: #006699; font-weight: bold\">return<\/span> i <span style=\"color: 
#555555\">-<\/span> <span style=\"color: #FF6600\">2<\/span>; });\n    println(<span style=\"color: #CC3300\">&quot;b&quot;<\/span>, b);\n \n    <span style=\"color: #006699; font-weight: bold\">const<\/span> <span style=\"color: #006699; font-weight: bold\">auto<\/span> c <span style=\"color: #555555\">=<\/span> a <span style=\"color: #555555\">+<\/span> b;\n    println(<span style=\"color: #CC3300\">&quot;c&quot;<\/span>, c);\n \n    <span style=\"color: #006699; font-weight: bold\">const<\/span> <span style=\"color: #006699; font-weight: bold\">auto<\/span> d <span style=\"color: #555555\">=<\/span> my_abs(c);\n    println(<span style=\"color: #CC3300\">&quot;d&quot;<\/span>, d);\n \n    <span style=\"color: #006699; font-weight: bold\">const<\/span> <span style=\"color: #006699; font-weight: bold\">auto<\/span> e <span style=\"color: #555555\">=<\/span> d <span style=\"color: #555555\">*<\/span> d;\n    println(<span style=\"color: #CC3300\">&quot;e&quot;<\/span>, e);\n \n    <span style=\"color: #006699; font-weight: bold\">const<\/span> <span style=\"color: #006699; font-weight: bold\">auto<\/span> inner_product <span style=\"color: #555555\">=<\/span> stdx<span style=\"color: #555555\">::<\/span>reduce(e);\n    std<span style=\"color: #555555\">::<\/span>cout <span style=\"color: #555555\">&lt;&lt;<\/span> <span style=\"color: #CC3300\">&quot;inner product: &quot;<\/span> <span style=\"color: #555555\">&lt;&lt;<\/span> inner_product <span style=\"color: #555555\">&lt;&lt;<\/span> <span style=\"color: #CC3300\">&#39;\\n&#39;<\/span>;\n \n    <span style=\"color: #006699; font-weight: bold\">const<\/span> stdx<span style=\"color: #555555\">::<\/span>fixed_size_simd<span style=\"color: #555555\">&lt;<\/span><span style=\"color: #007788; font-weight: bold\">long<\/span> <span style=\"color: #007788; font-weight: bold\">double<\/span>, <span style=\"color: #FF6600\">16<\/span><span style=\"color: #555555\">&gt;<\/span> x([](<span style=\"color: #007788; 
font-weight: bold\">int<\/span> i) { <span style=\"color: #006699; font-weight: bold\">return<\/span> i; });\n    println(<span style=\"color: #CC3300\">&quot;x&quot;<\/span>, x);\n    println(<span style=\"color: #CC3300\">&quot;cos\u00b2(x) + sin\u00b2(x)&quot;<\/span>, stdx<span style=\"color: #555555\">::<\/span>pow(stdx<span style=\"color: #555555\">::<\/span>cos(x), <span style=\"color: #FF6600\">2<\/span>) <span style=\"color: #555555\">+<\/span> stdx<span style=\"color: #555555\">::<\/span>pow(stdx<span style=\"color: #555555\">::<\/span>sin(x), <span style=\"color: #FF6600\">2<\/span>));\n}\n<\/pre><\/div>\n\n\n\n<p>Before I proceed with the program, I would like to introduce the output.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large is-resized\"><img loading=\"lazy\" decoding=\"async\" width=\"1030\" height=\"378\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/simd-1030x378.png\" alt=\"\" class=\"wp-image-10833\" style=\"width:500px\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/simd-1030x378.png 1030w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/simd-300x110.png 300w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/simd-768x282.png 768w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/simd-705x259.png 705w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2025\/06\/simd.png 1200w\" sizes=\"auto, (max-width: 1030px) 100vw, 1030px\" \/><\/figure>\n\n\n\n<p>First, I would like to focus on the <code>println<\/code> and <code>my_abs<\/code> functions. The <code>println<\/code> function outputs the name and content of a SIMD vector, iterating through its elements. <code>my_abs<\/code> calculates the absolute value of each element in a SIMD vector with integers, using <code>where <\/code>to conditionally negate negative values.<\/p>\n\n\n\n<p>The <code>main<\/code> function is much more interesting. 
In the SIMD vector <code>a<\/code>, each element is set to 1, whereas in the SIMD vector <code>b<\/code>, the lambda function initializes each element to its index minus 2. By default on x86-64, <code>stdx::native_simd<\/code> uses <a href=\"https:\/\/en.wikipedia.org\/wiki\/SSE2\">SSE2<\/a> instructions. These SIMD vectors are 128 bits in size, so each one holds four 32-bit <code>int<\/code> values. Now the arithmetic begins. Vector <code>c<\/code> is the element-wise sum of <code>a<\/code> and <code>b<\/code>, <code>d<\/code> is the element-wise absolute value of <code>c<\/code>, and vector <code>e<\/code> is the element-wise square of <code>d<\/code>. Finally, <code>stdx::reduce(e)<\/code> reduces vector <code>e<\/code> to the sum of its elements. With four elements, <code>b<\/code> is -2 -1 0 1, <code>c<\/code> is -1 0 1 2, <code>d<\/code> is 1 0 1 2, <code>e<\/code> is 1 0 1 4, and the sum is 6.<\/p>\n\n\n\n<p>The expression <code>const stdx::fixed_size_simd&lt;long double, 16&gt; x([](int i) { return i; })<\/code> is particularly interesting. It initializes the SIMD vector <code>x<\/code> with 16 <code>long double<\/code> values from 0 to 15. This is possible if the architecture is sufficiently modern and supports <a href=\"https:\/\/en.wikipedia.org\/wiki\/X86_SIMD_instruction_listings#SSE2_instructions\">AVX-512<\/a>. This applies, for example, to Intel&#8217;s Xeon Phi or AMD&#8217;s Zen 4 architecture. Similarly interesting is the line <code>println(&quot;cos\u00b2(x) + sin\u00b2(x)&quot;, stdx::pow(stdx::cos(x), 2) + stdx::pow(stdx::sin(x), 2))<\/code>. This calculates <code>cos\u00b2(x) + sin\u00b2(x)<\/code> for each element, which is 1 for all elements because of the Pythagorean trigonometric identity. All functions in <a href=\"https:\/\/en.cppreference.com\/w\/cpp\/header\/cmath.html\">&lt;cmath&gt;<\/a>, except for the special mathematical functions, are overloaded for <code>simd<\/code>. 
These include basic functions such as <code>abs<\/code>, <code>min<\/code>, and <code>max<\/code>. Exponential, power, trigonometric, hyperbolic, and gamma functions can also be applied directly to SIMD vectors.<\/p>\n\n\n\n<p>Now I would like to go into more detail about the width of the data type <code>simd&lt;T&gt;<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Width of <code>simd&lt;T&gt;<\/code><\/h2>\n\n\n\n<p>The width of the data type <code>native_simd&lt;T&gt;<\/code> is determined by the implementation at compile time. In contrast, the developer specifies the width of the data type <code>fixed_size_simd&lt;T, N&gt;<\/code> with the template argument <code>N<\/code>.<\/p>\n\n\n\n<p>The class template <code>simd<\/code> has the following declaration:<\/p>\n\n\n\n<!-- HTML generated using hilite.me --><div style=\"background: #f0f3f3; overflow:auto;width:auto;gray;border-width:.1em .1em .1em .8em\"><pre style=\"margin: 0; line-height: 125%\"><span style=\"color: #006699; font-weight: bold\">template<\/span><span style=\"color: #555555\">&lt;<\/span> <span style=\"color: #006699; font-weight: bold\">class<\/span> <span style=\"color: #00AA88; font-weight: bold\">T<\/span>, <span style=\"color: #006699; font-weight: bold\">class<\/span> <span style=\"color: #00AA88; font-weight: bold\">Abi<\/span> <span style=\"color: #555555\">=<\/span> simd_abi<span style=\"color: #555555\">::<\/span>compatible<span style=\"color: #555555\">&lt;<\/span>T<span style=\"color: #555555\">&gt;<\/span> <span style=\"color: #555555\">&gt;<\/span>\n<span style=\"color: #006699; font-weight: bold\">class<\/span> <span style=\"color: #00AA88; font-weight: bold\">simd<\/span>;\n<\/pre><\/div>\n\n\n\n<p>Here, <code>T<\/code> stands for the element type, which cannot be <code>bool<\/code>. 
The <code>Abi<\/code> tag determines the number of elements and how they are stored.<\/p>\n\n\n\n<p>There are two aliases for this class template:<\/p>\n\n\n\n<!-- HTML generated using hilite.me --><div style=\"background: #f0f3f3; overflow:auto;width:auto;gray;border-width:.1em .1em .1em .8em\"><pre style=\"margin: 0; line-height: 125%\"><span style=\"color: #006699; font-weight: bold\">template<\/span><span style=\"color: #555555\">&lt;<\/span> <span style=\"color: #006699; font-weight: bold\">class<\/span> <span style=\"color: #00AA88; font-weight: bold\">T<\/span>, <span style=\"color: #007788; font-weight: bold\">int<\/span> N <span style=\"color: #555555\">&gt;<\/span>\n<span style=\"color: #006699; font-weight: bold\">using<\/span> fixed_size_simd <span style=\"color: #555555\">=<\/span> std<span style=\"color: #555555\">::<\/span>experimental<span style=\"color: #555555\">::<\/span>simd<span style=\"color: #555555\">&lt;<\/span>T, std<span style=\"color: #555555\">::<\/span>experimental<span style=\"color: #555555\">::<\/span>simd_abi<span style=\"color: #555555\">::<\/span>fixed_size<span style=\"color: #555555\">&lt;<\/span>N<span style=\"color: #555555\">&gt;&gt;<\/span>;\n\t\t\n<span style=\"color: #006699; font-weight: bold\">template<\/span><span style=\"color: #555555\">&lt;<\/span> <span style=\"color: #006699; font-weight: bold\">class<\/span> <span style=\"color: #00AA88; font-weight: bold\">T<\/span> <span style=\"color: #555555\">&gt;<\/span>\n<span style=\"color: #006699; font-weight: bold\">using<\/span> native_simd <span style=\"color: #555555\">=<\/span> std<span style=\"color: #555555\">::<\/span>experimental<span style=\"color: #555555\">::<\/span>simd<span style=\"color: #555555\">&lt;<\/span>T, std<span style=\"color: #555555\">::<\/span>experimental<span style=\"color: #555555\">::<\/span>simd_abi<span style=\"color: #555555\">::<\/span>native<span style=\"color: #555555\">&lt;<\/span>T<span style=\"color: 
#555555\">&gt;&gt;<\/span>;\n<\/pre><\/div>\n\n\n\n\n<p>The following ABI tags are available:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><code>scalar<\/code>: stores a single element<\/li>\n\n\n\n<li><code>fixed_size<\/code>: stores a specified number of elements<\/li>\n\n\n\n<li><code>compatible<\/code>: ensures ABI compatibility<\/li>\n\n\n\n<li><code>native<\/code>: the most efficient ABI for the target architecture<\/li>\n\n\n\n<li><code>max_fixed_size<\/code>: the maximum number of elements guaranteed to be supported by <code>fixed_size<\/code><\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">What&#8217;s next?<\/h2>\n\n\n\n<p>After this initial example of data-parallel types, I would like to take a closer look at their functionality in the next article.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>After providing a theoretical introduction to the new C++ 26 feature in my last article, \u201cData-Parallel Types (SIMD),\u201d I would like to follow up today with a practical example. The following introductory example is from the experimental implementation of the SIMD library. 
This functionality has been fully adopted in the C++ 26 draft under the [&hellip;]<\/p>\n","protected":false},"author":21,"featured_media":10788,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[559],"tags":[566],"class_list":["post-10831","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-c26-blog","tag-data-parallel-types"],"_links":{"self":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/10831","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/comments?post=10831"}],"version-history":[{"count":9,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/10831\/revisions"}],"predecessor-version":[{"id":10842,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/10831\/revisions\/10842"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media\/10788"}],"wp:attachment":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media?parent=10831"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/categories?post=10831"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/tags?post=10831"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}