{"id":5428,"date":"2018-04-21T04:17:06","date_gmt":"2018-04-21T04:17:06","guid":{"rendered":"https:\/\/www.modernescpp.com\/index.php\/c-core-guidelines-rules-for-concurrency-and-parallelism\/"},"modified":"2023-06-26T11:52:23","modified_gmt":"2023-06-26T11:52:23","slug":"c-core-guidelines-rules-for-concurrency-and-parallelism","status":"publish","type":"post","link":"https:\/\/www.modernescpp.com\/index.php\/c-core-guidelines-rules-for-concurrency-and-parallelism\/","title":{"rendered":"C++ Core Guidelines: Rules for Concurrency and Parallelism"},"content":{"rendered":"<p><span style=\"color: #000000; font-family: 'Noto Serif', serif; font-size: 16px; font-style: normal; font-weight: 400; letter-spacing: normal; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; word-spacing: 0px; background-color: #ffffff; float: none;\">C++11 is the first C++ standard that deals with concurrency. The basic building block for concurrency is a thread; therefore, most rules are explicitly about threads. This changed dramatically with C++17.<br \/><\/span><\/p>\n<p>&nbsp;<\/p>\n<p><!--more--><\/p>\n<p>&nbsp;<img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-5426\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/timeline11_14.png\" alt=\"timeline11 14\" width=\"700\" height=\"350\" style=\"display: block; margin-left: auto; margin-right: auto;\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/timeline11_14.png 865w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/timeline11_14-300x150.png 300w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/timeline11_14-768x384.png 768w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><span>With <strong>C++17<\/strong>, we got the parallel algorithms of the Standard Template Library (STL). That means most of the algorithms of the STL can be executed sequentially, in parallel, or vectorized. 
For the curious reader: I have already written two posts about the parallel STL. The post <a href=\"https:\/\/www.modernescpp.com\/index.php\/parallel-algorithm-of-the-standard-template-library\">Parallel Algorithms of the Standard Template Library<\/a> explains the execution policies with which you can run an existing algorithm sequentially, in parallel, or in parallel and vectorized. C++17 also gave us new algorithms meant to run in parallel or vectorized. Here are the details: <a href=\"https:\/\/www.modernescpp.com\/index.php\/c-17-new-algorithm-of-the-standard-template-library\">C++17: New Parallel Algorithms of the Standard Template Library<\/a>.<br \/><\/span><\/p>\n<p><span>The concurrency story in C++ goes on. With C++20, we can hope for extended futures, coroutines, transactions, and more. From a bird&#8217;s-eye view, the concurrency facilities of C++11 and C++14 are only the implementation details on which the higher abstractions of C++17 and C++20 are based. Here is a series of posts about the concurrent future in <a href=\"https:\/\/www.modernescpp.com\/index.php\/category\/multithreading-c-17-and-c-20\">C++20<\/a>.<br \/><\/span><\/p>\n<p>That said, the rules are mainly about threads because, at the time of writing, neither GCC, Clang, nor MSVC has fully implemented the parallel algorithms of the STL. Best practices cannot be written for features that are not yet available (parallel STL) or not even standardized.<\/p>\n<p>This is the first point to remember when you read the rules: they are about the multithreading available in C++11 and C++14. The second point to keep in mind is that multithreading is very challenging. This means the rules aim to guide the novice, not the expert in this domain. 
The rules for the memory model will follow in the future.<\/p>\n<p>Now, let&#8217;s dive into the first rule.<\/p>\n<h2><a href=\"http:\/\/isocpp.github.io\/CppCoreGuidelines\/CppCoreGuidelines#Rconc-multi\">CP.1: Assume that your code will run as part of a multi-threaded program<\/a><\/h2>\n<p>I was astonished when I read this rule for the first time. Why should I optimize for the special case? To be clear, this rule is mainly about code that is used in libraries, not in applications. Experience shows that library code is often reused. This means you may well be optimizing for the general case, which is fine.<\/p>\n<p>To make the point of the rule clear, here is a small example.<\/p>\n<p>&nbsp;<\/p>\n<div style=\"background: #f0f3f3; overflow: auto; width: auto; gray;border-width: .1em .1em .1em .8em;\">\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #007788; font-weight: bold;\">double<\/span> <span style=\"color: #cc00ff;\">cached_computation<\/span>(<span style=\"color: #007788; font-weight: bold;\">double<\/span> x)\r\n{\r\n    <span style=\"color: #006699; font-weight: bold;\">static<\/span> <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_x <span style=\"color: #555555;\">=<\/span> <span style=\"color: #ff6600;\">0.0<\/span>;                       <span style=\"color: #0099ff; font-style: italic;\">\/\/ (1)<\/span>\r\n    <span style=\"color: #006699; font-weight: bold;\">static<\/span> <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_result <span style=\"color: #555555;\">=<\/span> COMPUTATION_OF_ZERO;  <span style=\"color: #0099ff; font-style: italic;\">\/\/ (2)<\/span>\r\n    <span style=\"color: #007788; font-weight: bold;\">double<\/span> result;\r\n\r\n    <span style=\"color: #006699; font-weight: bold;\">if<\/span> (cached_x <span style=\"color: #555555;\">==<\/span> x)                                  <span style=\"color: #0099ff; font-style: italic;\">\/\/ 
(1)<\/span>\r\n        <span style=\"color: #006699; font-weight: bold;\">return<\/span> cached_result;                           <span style=\"color: #0099ff; font-style: italic;\">\/\/ (2)<\/span>\r\n    result <span style=\"color: #555555;\">=<\/span> computation(x);\r\n    cached_x <span style=\"color: #555555;\">=<\/span> x;                                       <span style=\"color: #0099ff; font-style: italic;\">\/\/ (1)<\/span>\r\n    cached_result <span style=\"color: #555555;\">=<\/span> result;                             <span style=\"color: #0099ff; font-style: italic;\">\/\/ (2)<\/span>\r\n    <span style=\"color: #006699; font-weight: bold;\">return<\/span> result;\r\n}\r\n<\/pre>\n<\/div>\n<p>&nbsp;<\/p>\n<p>The function <span style=\"font-family: courier\\ new, courier;\">cached_computation<\/span> is fine if it runs in a single-threaded environment. This does not hold in a multithreaded environment because the static variables <span style=\"font-family: courier\\ new, courier;\">cached_x<\/span> (1) and <span style=\"font-family: courier\\ new, courier;\">cached_result<\/span> (2) can be used simultaneously by many threads and are modified while in use. The C++11 standard adds multithreading semantics to static variables with block scope, such as <span style=\"font-family: courier\\ new, courier;\">cached_x<\/span> and <span style=\"font-family: courier\\ new, courier;\">cached_result<\/span>. <strong>Since C++11, static variables with block scope are initialized in a thread-safe way.<\/strong><\/p>\n<p>This is fine but does not help in our case, because the guarantee covers only the initialization, not the later reads and writes. We will get a data race if we invoke <span style=\"font-family: courier\\ new, courier;\">cached_computation<\/span> simultaneously from many threads. The notion of a data race is essential to multithreading in C++; therefore, let me define it.<\/p>\n<p>A <strong>data race<\/strong> is a situation in which at least two threads access a shared variable simultaneously, and at least one of them tries to modify it.<\/p>\n<p>The rest is quite simple. 
If you have a data race in your program, your program has undefined behavior. Undefined behavior means you can no longer reason about your program because anything can happen. I mean anything. In my seminars, I often say: If your program has undefined behavior, it has catch-fire semantics. Even your computer can catch fire.<\/p>\n<p>If you read the definition of a data race carefully, you will notice that shared mutable state is necessary for a data race. Here is a picture to make this observation obvious.<\/p>\n<p>&nbsp;<img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-5427\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/SharedMutable.png\" alt=\"SharedMutable\" width=\"300\" height=\"211\" style=\"display: block; margin-left: auto; margin-right: auto;\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/SharedMutable.png 496w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2018\/04\/SharedMutable-300x211.png 300w\" sizes=\"auto, (max-width: 300px) 100vw, 300px\" \/><\/p>\n<p>So, what can you do to get rid of the data race? Making the static variables <span style=\"font-family: courier\\ new, courier;\">cached_x<\/span> (1) and <span style=\"font-family: courier\\ new, courier;\">cached_result<\/span> (2) immutable (<span style=\"font-family: courier\\ new, courier;\">const<\/span>) makes no sense because a cache has to be updated. This means access to both statics must be synchronized, or they must not be shared at all. Here are a few ways to achieve this.<\/p>\n<ol>\n<li>Protect each static with its own lock.<\/li>\n<li>Use one lock to protect the entire critical region.<\/li>\n<li>Protect the call to the function <span style=\"font-family: courier\\ new, courier;\">cached_computation<\/span> with a lock. 
<\/li>\n<li>Make both statics <span style=\"font-family: courier\\ new, courier;\">thread_local<\/span>. <span style=\"font-family: courier\\ new, courier;\">thread_local<\/span> guarantees that each thread gets its own copy of <span style=\"font-family: courier\\ new, courier;\">cached_x<\/span> and <span style=\"font-family: courier\\ new, courier;\">cached_result<\/span>. Just as a static variable is bound to the lifetime of the main thread, a <span style=\"font-family: courier\\ new, courier;\">thread_local<\/span> variable is bound to the lifetime of its thread.<\/li>\n<\/ol>\n<p>Here are variations 1, 2, 3, and 4.<\/p>\n<div style=\"background: #f0f3f3; overflow: auto; width: auto; gray;border-width: .1em .1em .1em .8em;\">\n<pre style=\"margin: 0; line-height: 125%;\">std<span style=\"color: #555555;\">::<\/span>mutex m_x;\r\nstd<span style=\"color: #555555;\">::<\/span>mutex m_result;\r\n<span style=\"color: #007788; font-weight: bold;\">double<\/span> <span style=\"color: #cc00ff;\">cached_computation<\/span>(<span style=\"color: #007788; font-weight: bold;\">double<\/span> x){                <span style=\"color: #0099ff; font-style: italic;\">\/\/ (1)<\/span>\r\n    <span style=\"color: #006699; font-weight: bold;\">static<\/span> <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_x <span style=\"color: #555555;\">=<\/span> <span style=\"color: #ff6600;\">0.0<\/span>;\r\n    <span style=\"color: #006699; font-weight: bold;\">static<\/span> <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_result <span style=\"color: #555555;\">=<\/span> COMPUTATION_OF_ZERO;\r\n\t\r\n    <span style=\"color: #007788; font-weight: bold;\">double<\/span> result;\r\n    {\r\n\tstd<span style=\"color: #555555;\">::<\/span>scoped_lock lck(m_x, m_result);\r\n        <span style=\"color: #006699; 
font-weight: bold;\">if<\/span> (cached_x <span style=\"color: #555555;\">==<\/span> x) <span style=\"color: #006699; font-weight: bold;\">return<\/span> cached_result;\r\n    }\r\n    result <span style=\"color: #555555;\">=<\/span> computation(x);\r\n    {\r\n\tstd<span style=\"color: #555555;\">::<\/span>lock_guard<span style=\"color: #555555;\">&lt;<\/span>std<span style=\"color: #555555;\">::<\/span>mutex<span style=\"color: #555555;\">&gt;<\/span> lck(m_x);\r\n        cached_x <span style=\"color: #555555;\">=<\/span> x;\r\n    }\r\n    { \r\n\tstd<span style=\"color: #555555;\">::<\/span>lock_guard<span style=\"color: #555555;\">&lt;<\/span>std<span style=\"color: #555555;\">::<\/span>mutex<span style=\"color: #555555;\">&gt;<\/span> lck(m_result);  \r\n        cached_result <span style=\"color: #555555;\">=<\/span> result;\r\n    }\r\n    <span style=\"color: #006699; font-weight: bold;\">return<\/span> result;\r\n}\r\n\r\nstd<span style=\"color: #555555;\">::<\/span>mutex m;\r\n<span style=\"color: #007788; font-weight: bold;\">double<\/span> <span style=\"color: #cc00ff;\">cached_computation<\/span>(<span style=\"color: #007788; font-weight: bold;\">double<\/span> x){                <span style=\"color: #0099ff; font-style: italic;\">\/\/ (2)<\/span>\r\n    <span style=\"color: #006699; font-weight: bold;\">static<\/span> <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_x <span style=\"color: #555555;\">=<\/span> <span style=\"color: #ff6600;\">0.0<\/span>;\r\n    <span style=\"color: #006699; font-weight: bold;\">static<\/span> <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_result <span style=\"color: #555555;\">=<\/span> COMPUTATION_OF_ZERO;\r\n    <span style=\"color: #007788; font-weight: bold;\">double<\/span> result;\r\n    {\r\n        std<span style=\"color: #555555;\">::<\/span>lock_guard<span style=\"color: #555555;\">&lt;<\/span>std<span style=\"color: #555555;\">::<\/span>mutex<span 
style=\"color: #555555;\">&gt;<\/span> lck(m);\r\n\t<span style=\"color: #006699; font-weight: bold;\">if<\/span> (cached_x <span style=\"color: #555555;\">==<\/span> x) <span style=\"color: #006699; font-weight: bold;\">return<\/span> cached_result;\r\n\tresult <span style=\"color: #555555;\">=<\/span> computation(x);\r\n\tcached_x <span style=\"color: #555555;\">=<\/span> x;\r\n\tcached_result <span style=\"color: #555555;\">=<\/span> result;\r\n    }\r\n    <span style=\"color: #006699; font-weight: bold;\">return<\/span> result;\r\n}\r\n\r\nstd<span style=\"color: #555555;\">::<\/span>mutex cachedComputationMutex;                  <span style=\"color: #0099ff; font-style: italic;\">\/\/ (3)<\/span>\r\n{\r\n    std<span style=\"color: #555555;\">::<\/span>lock_guard<span style=\"color: #555555;\">&lt;<\/span>std<span style=\"color: #555555;\">::<\/span>mutex<span style=\"color: #555555;\">&gt;<\/span> lck(cachedComputationMutex);\r\n    <span style=\"color: #006699; font-weight: bold;\">auto<\/span> cached <span style=\"color: #555555;\">=<\/span> cached_computation(<span style=\"color: #ff6600;\">3.33<\/span>);\r\n}\r\n\r\n\r\n<span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_computation(<span style=\"color: #007788; font-weight: bold;\">double<\/span> x){                <span style=\"color: #0099ff; font-style: italic;\">\/\/ (4)<\/span>\r\n    thread_local <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_x <span style=\"color: #555555;\">=<\/span> <span style=\"color: #ff6600;\">0.0<\/span>;\r\n    thread_local <span style=\"color: #007788; font-weight: bold;\">double<\/span> cached_result <span style=\"color: #555555;\">=<\/span> COMPUTATION_OF_ZERO;\r\n    <span style=\"color: #007788; font-weight: bold;\">double<\/span> result;\r\n\r\n    <span style=\"color: #006699; font-weight: bold;\">if<\/span> (cached_x <span style=\"color: #555555;\">==<\/span> x) <span style=\"color: #006699; font-weight: 
bold;\">return<\/span> cached_result;\r\n    result <span style=\"color: #555555;\">=<\/span> computation(x);\r\n    cached_x <span style=\"color: #555555;\">=<\/span> x;\r\n    cached_result <span style=\"color: #555555;\">=<\/span> result;\r\n    <span style=\"color: #006699; font-weight: bold;\">return<\/span> result;\r\n}\r\n<\/pre>\n<\/div>\n<p>&nbsp;<\/p>\n<p>First, the C++11 standard guarantees that static variables with block scope are initialized in a thread-safe way; therefore, I don&#8217;t have to protect their initialization in any of the versions.<\/p>\n<ol>\n<li>This version is tricky because I have to acquire both locks in one atomic step. C++17 supports <span style=\"font-family: courier\\ new, courier;\">std::scoped_lock<\/span>, which can lock an arbitrary number of mutexes in an atomic step. In C++11, you have to use a <span style=\"font-family: courier\\ new, courier;\">std::unique_lock<\/span> in combination with the function <span style=\"font-family: courier\\ new, courier;\">std::lock<\/span> instead. My previous post <a href=\"https:\/\/www.modernescpp.com\/index.php\/prefer-locks-to-mutexes\">Prefer Locks to Mutexes<\/a> provides more details. <strong>This solution still has a race condition on <span style=\"font-family: courier\\ new, courier;\">cached_x<\/span> and <span style=\"font-family: courier\\ new, courier;\">cached_result<\/span> because the two are written under separate locks, although they must be updated together atomically.<\/strong><\/li>\n<li>Version 2 uses more coarse-grained locking. Usually, you should prefer fine-grained locking to a coarse-grained lock such as the one in this version, but in this use case, it may be fine.<\/li>\n<li>This is the most coarse-grained solution because the entire function call is locked. Of course, the downside is that the user of the function is responsible for the synchronization. 
In general, that is a bad idea.<\/li>\n<li>Just make the static variables <span style=\"font-family: courier\\ new, courier;\">thread_local<\/span>, and you are done.<\/li>\n<\/ol>\n<p>In the end, it is a question of performance and your users. Therefore, try each variation, measure, and think about the people who will use and maintain your code.<\/p>\n<h2>What&#8217;s next?<\/h2>\n<p>This post was just the starting point of a long journey through the rules for concurrency in C++. In the <a href=\"https:\/\/www.modernescpp.com\/index.php\/c-core-guidelines-more-rules-to-concurrency-and-parallelism\">next post<\/a>, I will talk about threads and shared state.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>C++11 is the first C++ standard that deals with concurrency. The basic building block for concurrency is a thread; therefore, most rules are explicitly about threads. This changed dramatically with C++17. 
&nbsp;<\/p>\n","protected":false},"author":21,"featured_media":5426,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[372],"tags":[430,487],"class_list":["post-5428","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-modern-c","tag-lock","tag-thread_local"],"_links":{"self":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/5428","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/comments?post=5428"}],"version-history":[{"count":1,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/5428\/revisions"}],"predecessor-version":[{"id":6830,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/5428\/revisions\/6830"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media\/5426"}],"wp:attachment":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media?parent=5428"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/categories?post=5428"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/tags?post=5428"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}