{"id":4846,"date":"2016-08-03T07:11:18","date_gmt":"2016-08-03T07:11:18","guid":{"rendered":"https:\/\/www.modernescpp.com\/index.php\/ongoing-optimization-2\/"},"modified":"2023-06-26T12:47:37","modified_gmt":"2023-06-26T12:47:37","slug":"ongoing-optimization-2","status":"publish","type":"post","link":"https:\/\/www.modernescpp.com\/index.php\/ongoing-optimization-2\/","title":{"rendered":"Ongoing Optimization: Unsynchronized Access with CppMem"},"content":{"rendered":"<p>I described my challenge in the last post. Let&#8217;s start with our process of <a href=\"https:\/\/www.modernescpp.com\/index.php\/ongoing-optimization\">ongoing optimization<\/a>. To be sure, I verify my reasoning with <a href=\"https:\/\/www.modernescpp.com\/index.php\/cppmem-an-overview\">CppMem<\/a>. I&nbsp;once made a big mistake in my presentation at <a href=\"https:\/\/www.youtube.com\/watch?v=paK38WAq8WY\">Meeting C++ 2014<\/a>.<\/p>\n<p><!--more--><\/p>\n<p>&nbsp;<\/p>\n<p>As a reminder, this is our starting point.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"The_program\"><\/span>The program<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; gray;border-width: .1em .1em .1em .8em;\">\n<table>\n<tbody>\n<tr>\n<td>\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #008000;\">\/\/ ongoingOptimization.cpp<\/span>\r\n\r\n<span style=\"color: #0000ff;\">#include &lt;iostream&gt;<\/span>\r\n<span style=\"color: #0000ff;\">#include &lt;thread&gt;<\/span>\r\n\r\n<span style=\"color: #2b91af;\">int<\/span> x= 0;\r\n<span style=\"color: #2b91af;\">int<\/span> y= 0;\r\n\r\n<span style=\"color: #2b91af;\">void<\/span> writing(){\r\n  x= 2000;\r\n  y= 
11;\r\n}\r\n\r\n<span style=\"color: #2b91af;\">void<\/span> reading(){ \r\n  std::cout &lt;&lt; <span style=\"color: #a31515;\">\"y: \"<\/span> &lt;&lt; y &lt;&lt; <span style=\"color: #a31515;\">\" \"<\/span>;\r\n  std::cout &lt;&lt; <span style=\"color: #a31515;\">\"x: \"<\/span> &lt;&lt; x &lt;&lt; std::endl;\r\n}\r\n\r\n<span style=\"color: #2b91af;\">int<\/span> main(){\r\n  std::<span style=\"color: #0000ff;\">thread<\/span> thread1(writing);\r\n  std::<span style=\"color: #0000ff;\">thread<\/span> thread2(reading);\r\n  thread1.join();\r\n  thread2.join();\r\n}\r\n<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Unsynchronized\"><\/span>Unsynchronized<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The program has two data races and therefore has undefined behavior. Neither the access to the variable x nor the access to the variable y is protected. Because the program has undefined behavior, any result is possible. In C++ jargon, that means a cruise missile can launch, or your PC can catch fire. That has never happened to me, but&#8230;<\/p>\n<p>So, we can make no statement about the values of x and y.<\/p>\n<p>&nbsp;<img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-4840\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/undefinedEng.png\" alt=\"undefinedEng\" style=\"margin: 15px;\" width=\"343\" height=\"240\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/undefinedEng.png 343w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/undefinedEng-300x210.png 300w\" sizes=\"auto, (max-width: 343px) 100vw, 343px\" \/><\/p>\n<h3><span class=\"ez-toc-section\" id=\"Its_not_so_bad\"><\/span>It&#8217;s not so bad<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>The known architectures guarantee that access to an <span style=\"font-family: courier new,courier;\">int<\/span> variable is atomic. But the int variable must be naturally aligned. 
Naturally aligned means that the address of a variable is a multiple of its size: a 4-byte int must have an address divisible by 4, and an 8-byte integer must have an address divisible by 8. There is a reason why I mention this so explicitly. With C++11, you can adjust the <a href=\"http:\/\/en.cppreference.com\/w\/cpp\/memory\/align\">alignment<\/a> of your data types.<\/p>\n<p>Once more: I am not saying that you should treat int variables as atomics. I am only saying that the compiler, in this case, guarantees more than the C++11 standard. But if you rely on this rule, your program does not conform to the C++ standard.<\/p>\n<p>This was my reasoning. Now we should look at what CppMem says about the undefined behavior of the program.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"CppMem\"><\/span>CppMem<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; gray;border-width: .1em .1em .1em .8em;\">\n<table>\n<tbody>\n<tr>\n<td>\n<pre style=\"margin: 0; line-height: 125%;\"> 1\r\n 2\r\n 3\r\n 4\r\n 5\r\n 6\r\n 7\r\n 8\r\n 9\r\n10\r\n11\r\n12\r\n13<\/pre>\n<\/td>\n<td>\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #2b91af;\">int<\/span> main() {\r\n  <span style=\"color: #2b91af;\">int<\/span> x=0; \r\n  <span style=\"color: #2b91af;\">int<\/span> y=0;\r\n  {{{ { \r\n        x= 2000; \r\n        y= 11;\r\n      }\r\n  ||| {\r\n        y;\r\n        x;\r\n      }  \r\n  }}}\r\n}\r\n<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<p>The program is reduced to the bare minimum. You define the threads with the triple curly braces (lines 4 and 12)&nbsp;and the triple pipe symbol (line 8). The additional curly braces in lines 4 and 7 and in lines 8 and 11 define the work package of each thread. Because I&#8217;m not interested in the output of the variables x and y, I only read them in lines 9 and 10.<\/p>\n<p>That was the theory for CppMem. 
Now to the analysis.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Die_Analysis\"><\/span>The Analysis<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>If I execute the program, CppMem complains in red letters (<strong><span style=\"color: #ff0000;\">1<\/span><\/strong>) that none of the four possible interleavings of the threads is race free. Only the first execution is consistent. Now I can use CppMem to switch between the four executions (<strong><span style=\"color: #ff0000;\">2<\/span><\/strong>) and analyze the annotated graph (<strong><span style=\"color: #ff0000;\">3<\/span><\/strong>).<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-4841\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/FullExcection.png\" alt=\"FullExcection\" width=\"800\" height=\"600\" style=\"margin: 15px;\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/FullExcection.png 1149w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/FullExcection-300x225.png 300w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/FullExcection-1024x768.png 1024w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/FullExcection-768x576.png 768w\" sizes=\"auto, (max-width: 800px) 100vw, 800px\" \/><\/p>\n<p>We get the most out of CppMem from the graph. 
So I will dive deeper into the four graphs.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"First_execution\"><\/span>First execution<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>What information can we draw from the graph (<span style=\"color: #ff0000;\"><strong>3<\/strong><\/span>)?<\/p>\n<p>&nbsp;<img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-4842\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/first.png\" alt=\"first\" width=\"500\" height=\"305\" style=\"margin: 15px;\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/first.png 380w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/first-300x183.png 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/p>\n<p>The graph nodes represent the program&#8217;s expressions, and the edges represent the relations between the expressions. In my explanation, I will refer to the names (a) to (f). So, what can I derive from the annotations in this concrete execution?<\/p>\n<ul>\n<li><strong>a:Wna x=0:<\/strong> The first expression (a) is a non-atomic write of x.<\/li>\n<li><strong>sb (sequenced-before): <\/strong>The write in the first expression (a) is sequenced before the write in the second expression (b). This relation also holds between the expressions (c) and (d) and between (e) and (f).<\/li>\n<li><strong>rf (read-from)<\/strong>: The expression&nbsp;(e) reads the value of y from the expression (b). Accordingly, (f) reads from (a).<\/li>\n<li><strong>sw (synchronizes-with)<\/strong>: The expression (a) synchronizes with (f). This relation holds because the expression (f) occurs in a separate thread. The creation of a thread is a synchronization point.&nbsp;Everything that happens before the thread creation is visible in the thread. 
For reasons of symmetry, the same holds between&nbsp;(b) and (e).<\/li>\n<li><strong>dr (data race)<\/strong>: This is the&nbsp;<a href=\"https:\/\/www.modernescpp.com\/index.php\/threads-sharing-data\">data race<\/a> between the reading and the writing of the variables x and y. So the program has undefined behavior.<\/li>\n<\/ul>\n<h4>Why is the execution consistent?<\/h4>\n<p>The execution is consistent because the values of x and y are read from the writes to x and y in the&nbsp;main thread, (a) and (b). If the values were read from the writes to x and y in the separate thread, the expressions (c) and (d), the values of x and y in (e) and (f) could be only partially read. This is not consistent. To put it differently: in this concrete execution, x and y get the value 0. You can also see that in the expressions (e) and (f).<\/p>\n<p>This guarantee will not hold for the subsequent three executions, which I turn to now.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Second_execution\"><\/span>Second execution<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>&nbsp;<img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-4843\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/second.png\" alt=\"second\" width=\"500\" height=\"350\" style=\"margin: 15px;\" \/><\/p>\n<p>In this non-consistent execution, the expression (e) reads the value of y from the expression (d). 
The write in (d) happens concurrently with the read in (e).<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Third_execution\"><\/span>Third execution<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-4844\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/third.png\" alt=\"third\" width=\"500\" height=\"247\" style=\"margin: 15px;\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/third.png 448w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/third-300x148.png 300w\" sizes=\"auto, (max-width: 500px) 100vw, 500px\" \/><\/p>\n<p>This execution is symmetrical to the second one. The expression (f) reads from the expression (c).<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Fourth_execution\"><\/span>Fourth execution<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-4845\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/08\/four.png\" alt=\"four\" width=\"500\" height=\"376\" style=\"margin: 15px;\" \/><\/p>\n<p>Now everything goes wrong. The expressions (e) and (f) read from the expressions (d) and (c).<\/p>\n<h2><span class=\"ez-toc-section\" id=\"A_short_conclusion\"><\/span>A short conclusion<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Although I used only the default configuration of CppMem and only the graph, I got a lot of valuable information and insight. 
In particular, CppMem gets right to the point.<\/p>\n<ol>\n<li>All four combinations of (y, x) are possible: <strong>(0,0), (11,0), (0,2000), (11,2000)<\/strong>.<\/li>\n<li>The program has data races and, therefore, undefined behavior.<\/li>\n<li>Only one of the four executions is consistent.<\/li>\n<\/ol>\n<h2><span class=\"ez-toc-section\" id=\"Whats_next\"><\/span>What&#8217;s next?<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>What is the most obvious way to synchronize a multithreaded program? Using a mutex, of course. This is the topic of the <a href=\"https:\/\/www.modernescpp.com\/index.php\/ongoing-optimization-locks\">next post<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>I described my challenge in the last post. Let&#8217;s start with our process of ongoing optimization. To be sure, I verify my reasoning with CppMem. I&nbsp;once made a big mistake in my presentation at Meeting C++ 
2014.<\/p>\n","protected":false},"author":21,"featured_media":4840,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[369],"tags":[486,521],"class_list":["post-4846","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-multithreading-application","tag-cppmem","tag-ongoing-optimization"],"_links":{"self":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/4846","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/comments?post=4846"}],"version-history":[{"count":1,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/4846\/revisions"}],"predecessor-version":[{"id":6958,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/4846\/revisions\/6958"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media\/4840"}],"wp:attachment":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media?parent=4846"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/categories?post=4846"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/tags?post=4846"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}