{"id":5058,"date":"2016-12-06T16:21:55","date_gmt":"2016-12-06T16:21:55","guid":{"rendered":"https:\/\/www.modernescpp.com\/index.php\/memory-and-performance-overhead-of-smart-pointer\/"},"modified":"2023-06-26T12:33:43","modified_gmt":"2023-06-26T12:33:43","slug":"memory-and-performance-overhead-of-smart-pointer","status":"publish","type":"post","link":"https:\/\/www.modernescpp.com\/index.php\/memory-and-performance-overhead-of-smart-pointer\/","title":{"rendered":"Memory and Performance Overhead of Smart Pointers"},"content":{"rendered":"<p>C++11 offers four different smart pointers. In this post, I take a closer look at the memory and performance overhead of two of them. My first candidate, <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span>, manages the lifetime of one resource exclusively; std::shared_ptr shares the ownership of a resource with other std::shared_ptrs. I will state the result of my tests before I show you the raw numbers: <strong>There are only a few reasons in modern C++ justifying the memory management with new and delete.<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p><!--more--><\/p>\n<p>&nbsp;<\/p>\n<p>Why? Here are the numbers.<\/p>\n<h2>Memory overhead<\/h2>\n<h3>std::unique_ptr<\/h3>\n<p><span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> needs, by default, no additional memory. That means <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> is as big as its underlying raw pointer. But what does default mean? You can parametrize a <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> with a special deleter function. If you specify a deleter that carries state, such as a function pointer, you get an enriched <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> and pay for it with additional memory. 
As I mentioned, that is the special case.<\/p>\n<p>In contrast to <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span>, <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> has a little overhead.<\/p>\n<h3>std::shared_ptr<\/h3>\n<p><span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span>s share a resource. They internally use a reference counter. When a <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> is copied, the reference counter is incremented; when a <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> goes out of scope, it is decremented. Therefore, a <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> needs additional memory for the reference counter. (To be precise, there is an additional reference counter for the <span style=\"font-family: courier new,courier;\">std::weak_ptr<\/span>.) That&#8217;s the overhead a <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> has compared to a raw pointer.<\/p>\n<\/p>\n<h2>Performance overhead<\/h2>\n<p>The performance story is a little more involved. Therefore, I let the numbers speak for themselves. A simple performance test should give an idea of the overall performance.<\/p>\n<h3>The performance test<\/h3>\n<p>In the test, I allocate and deallocate memory 100&#8217;000&#8217;000 times. 
Of course, I&#8217;m interested in how long it takes.<\/p>\n<p><!-- HTML generated using hilite.me --><\/p>\n<div style=\"background: #ffffff; overflow: auto; width: auto; gray;border-width: .1em .1em .1em .8em;\">\n<table>\n<tbody>\n<tr>\n<td>\n<pre style=\"margin: 0; line-height: 125%;\"> 1\r\n 2\r\n 3\r\n 4\r\n 5\r\n 6\r\n 7\r\n 8\r\n 9\r\n10\r\n11\r\n12\r\n13\r\n14\r\n15\r\n16\r\n17\r\n18\r\n19\r\n20\r\n21\r\n22\r\n23\r\n24<\/pre>\n<\/td>\n<td>\n<pre style=\"margin: 0; line-height: 125%;\"><span style=\"color: #008000;\">\/\/ all.cpp<\/span>\r\n\r\n<span style=\"color: #0000ff;\">#include &lt;chrono&gt;<\/span>\r\n<span style=\"color: #0000ff;\">#include &lt;iostream&gt;<\/span>\r\n\r\n<span style=\"color: #0000ff;\">static<\/span> <span style=\"color: #0000ff;\">const<\/span> <span style=\"color: #2b91af;\">long<\/span> <span style=\"color: #2b91af;\">long<\/span> numInt= 100000000;\r\n\r\n<span style=\"color: #2b91af;\">int<\/span> main(){\r\n\r\n  <span style=\"color: #0000ff;\">auto<\/span> start = std::chrono::system_clock::now();\r\n\r\n  <span style=\"color: #0000ff;\">for<\/span> ( <span style=\"color: #2b91af;\">long<\/span> <span style=\"color: #2b91af;\">long<\/span> i=0 ; i &lt; numInt; ++i){\r\n    <span style=\"color: #2b91af;\">int<\/span>* tmp(<span style=\"color: #0000ff;\">new<\/span> <span style=\"color: #2b91af;\">int<\/span>(i));\r\n    <span style=\"color: #0000ff;\">delete<\/span> tmp;\r\n    <span style=\"color: #008000;\">\/\/ std::shared_ptr&lt;int&gt; tmp(new int(i));<\/span>\r\n    <span style=\"color: #008000;\">\/\/ std::shared_ptr&lt;int&gt; tmp(std::make_shared&lt;int&gt;(i));<\/span>\r\n    <span style=\"color: #008000;\">\/\/ std::unique_ptr&lt;int&gt; tmp(new int(i));<\/span>\r\n    <span style=\"color: #008000;\">\/\/ std::unique_ptr&lt;int&gt; tmp(std::make_unique&lt;int&gt;(i));<\/span>\r\n  }\r\n\r\n  std::chrono::duration&lt;<span style=\"color: #2b91af;\">double<\/span>&gt; dur= std::chrono::system_clock::now() - 
start;\r\n  std::cout &lt;&lt; <span style=\"color: #a31515;\">\"time native: \"<\/span> &lt;&lt; dur.count() &lt;&lt; <span style=\"color: #a31515;\">\" seconds\"<\/span> &lt;&lt; std::endl;\r\n\r\n}\r\n<\/pre>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<\/div>\n<p>&nbsp;<\/p>\n<p>In my test, I compare explicit calls of <span style=\"font-family: courier new,courier;\">new<\/span> and <span style=\"font-family: courier new,courier;\">delete<\/span> (lines 13 and 14) with the use of <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> (line 15), <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> (line 16), <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> (line 17), and <span style=\"font-family: courier new,courier;\">std::make_unique<\/span> (line 18). In this small program, the smart pointers (lines 15 &#8211; 18) are much simpler to handle because each one automatically releases its dynamically created <span style=\"font-family: courier new,courier;\">int<\/span> when it goes out of scope.<\/p>\n<p>The functions <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> (line 16) and <span style=\"font-family: courier new,courier;\">std::make_unique<\/span> (line 18) are quite handy. Each one creates the corresponding smart pointer. In particular, <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> is very interesting. Creating a std::shared_ptr from a raw pointer requires two memory allocations: one for the managed resource and one for the reference counters. <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> combines them into a single allocation, and the performance benefits. <span style=\"font-family: courier new,courier;\">std::make_unique<\/span> has been available since C++14; the other smart pointer functionality since C++11.<\/p>\n<p>I use GCC 4.9.2 and cl.exe for my performance tests. 
The cl.exe is part of Microsoft Visual Studio 2015. Although cl.exe officially supports only C++11, the helper function <span style=\"font-family: courier new,courier;\">std::make_unique<\/span> is already available. Therefore, I can run my performance test with maximum optimization and without optimization. I must admit that my Windows PC is less powerful than my Linux PC. Therefore, I&#8217;m interested only in comparing raw memory allocation with the smart pointers. I&#8217;m not comparing Windows and Linux.<\/p>\n<p>Here are the raw numbers.<\/p>\n<h3>The raw numbers<\/h3>\n<p>For simplicity, I will not show screenshots of the program runs; I present only the table holding the numbers.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\" size-full wp-image-5057\" src=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/12\/comparisonEng.png\" alt=\"comparisonEng\" width=\"700\" height=\"165\" style=\"margin: 15px;\" srcset=\"https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/12\/comparisonEng.png 901w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/12\/comparisonEng-300x71.png 300w, https:\/\/www.modernescpp.com\/wp-content\/uploads\/2016\/12\/comparisonEng-768x182.png 768w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>I want to draw a few interesting conclusions from the table.<\/p>\n<ol>\n<li>Optimization matters. In the case of std::make_shared, the program with maximum optimization is almost ten times faster. This observation also holds for the other smart pointers. The&nbsp;optimized program is about 2 to 3 times faster. 
Interestingly, this observation does not hold for <span style=\"font-family: courier new,courier;\">new<\/span> and <span style=\"font-family: courier new,courier;\">delete<\/span>.<\/li>\n<li><span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span>, <span style=\"font-family: courier new,courier;\">std::make_unique<\/span>, and, with minor deviations, <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> are in the same performance range as <span style=\"font-family: courier new,courier;\">new<\/span> and <span style=\"font-family: courier new,courier;\">delete<\/span>.<\/li>\n<li>You should not use <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> and <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> without optimization. Even with optimization, <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> is about two times slower than <span style=\"font-family: courier new,courier;\">new<\/span> and <span style=\"font-family: courier new,courier;\">delete<\/span>.<\/li>\n<\/ol>\n<h2>My conclusion<\/h2>\n<ul>\n<li><span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> has no memory or performance overhead compared to the explicit use of <span style=\"font-family: courier new,courier;\">new<\/span> and <span style=\"font-family: courier new,courier;\">delete<\/span>. That is great because <span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> offers an excellent benefit by&nbsp;automatically managing the lifetime of its resource <strong>without any additional cost.<\/strong><\/li>\n<li>My conclusion about <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> is not so easy. 
Admittedly, <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> is about two times slower than <span style=\"font-family: courier new,courier;\">new<\/span> and <span style=\"font-family: courier new,courier;\">delete<\/span>. Even <span style=\"font-family: courier new,courier;\">std::make_shared<\/span> has a performance overhead of about 10%. But this comparison rests on a wrong assumption because <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> models shared ownership. That means only the first <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span> has to shoulder the performance and memory overhead. Each additional shared pointer shares the infrastructure for managing the underlying object. In particular, the memory for that infrastructure is allocated only once, no matter how many <span style=\"font-family: courier new,courier;\">std::shared_ptr<\/span>s share the resource.&nbsp;<\/li>\n<\/ul>\n<p>Now I can repeat myself:<strong>&nbsp;<strong> <strong>There are only<span id=\"transmark\"> <\/span>a few reasons in modern C++ justifying the memory management with new and delete.<\/strong><\/strong><\/strong><\/p>\n<h2>What&#8217;s next?<\/h2>\n<p>After this plea for smart pointers, I will present the details of&nbsp;<span style=\"font-family: courier new,courier;\">std::unique_ptr<\/span> in the <a href=\"https:\/\/www.modernescpp.com\/index.php\/std-unique-ptr\">next post<\/a>.<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>C++11 offers four different smart pointers. In this post, I take a closer look at the memory and performance overhead of two of them. My first candidate, std::unique_ptr, manages the lifetime of one resource exclusively; std::shared_ptr shares the ownership of a resource with other std::shared_ptrs. 
I will state the result of my tests [&hellip;]<\/p>\n","protected":false},"author":21,"featured_media":5057,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[364],"tags":[498,400,399,401],"class_list":["post-5058","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-embedded","tag-memory","tag-shared_ptr","tag-smart-pointers","tag-unique_ptr"],"_links":{"self":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/5058","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/users\/21"}],"replies":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/comments?post=5058"}],"version-history":[{"count":1,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/5058\/revisions"}],"predecessor-version":[{"id":6917,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/posts\/5058\/revisions\/6917"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media\/5057"}],"wp:attachment":[{"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/media?parent=5058"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/categories?post=5058"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.modernescpp.com\/index.php\/wp-json\/wp\/v2\/tags?post=5058"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}