A lot was written about the advantages of move semantics to copy semantics. Instead of an expensive copy operation, you can use a cheap move operation. But what does that mean? In this post, I will compare the copy and move semantics performance for the Standard Template Library (STL) containers.
Before I show the number, I will provide some background information.
Copy versus Move Semantics
The subtle difference is if you create with a copy or move semantic a new object based on an existing one, the copy semantic will copy the elements of the resource, and the move semantic will move the elements of the resource. Of course, copying is expensive, and moving is cheap. But there are additional serious consequences.
- With copy semantic, it can happen that a std::bad_alloc will be thrown because your program is out of memory.
- The resource of the move operation is afterward in a “valid but unspecified state“.
The second point is very nice to show with std::string.
At first, the classical copy semantics.
After the copy operation, strings str1 and str2 have the same content, “ABCDEF“. So, what’s the difference between the move semantics?
The string str1 opposes the copy semantic afterward empty “”. This is not guaranteed but often the case. I explicitly requested the move semantic with the function std::move. The compiler will automatically perform the move semantic if it is sure that the source of the move semantic is not needed anymore.
I will explicitly request the move semantic in my program by using std::move.
The Performance Differences
I will take the naive position in my post and compare the performance difference between the copy and move semantics of the STL containers. My comparison will include the std::string. I will ignore the associative containers, which can have more equal keys. I’m particularly interested in the performance ratio between the copy and move semantics of the containers.
The boundary conditions
The differences were not so dramatic between the program with maximum optimization and without optimization; therefore, I will, for simplicity reasons, only provide the results for the executable with maximum optimization. I use a GCC 4.9.2 compiler and the cl.exe compiler, which is part of Microsoft Visual Studio 2015. Both platforms are 64-bit. Therefore, the executables are built for 64-bit.
We have a lot of containers in the STL. Therefore, the program is a little bit lengthy.
The idea of the program is to initialize the containers with 10 million elements. Of course, the initialization will happen with copy and move semantics. The performance measurement occurs in the function template measurePerformane (lines 21 – 44). The function takes as an argument the container and the name of the container. Thanks to the Chrono library I can measure how long the copy initialization (line 27) and the move initialization (line 34) take. Ultimately, I’m interested in the ratio between the copy and move semantics (line 40).
What’s happening in the main function? I create each container with its scope so that it will be automatically released. Therefore, myArray (line 51) will automatically be released and the end of its scope (line 53). Because the containers are quite big, releasing their memory is a must. I claimed that each container has 10 million elements. That will not hold for myArray. Because myArray will not be allocated on the heap, I must dramatically reduce its size. But now to the remaining containers. With std::vector, std::deque, std::list, and std::forward_list there are in lines 55 – 73 the remaining sequential containers. In lines 75 – 78 std::string follows. The rest are the associative containers. I have to pay attention to one characteristic of the associative container. To have unique keys and therefore the size 10 million, I use the numbers 0 to 9999999 as keys. The function std::iota does the job.
The results of std::array are not so meaningful. On the one hand, std::array is not so big; on the other hand, the time difference on Windows is not measurable with the clock std::system_clock.
What insight can I derive from the numbers?
- Sequential container: std::vector is the fastest container in case of copying or moving.
- Sequential versus associative container: Copying the sequential container on Linux and Windows is faster.
- Copy versus move semantics: The differences between the copy and move semantics are huge. That holds, in particular, true for the associative containers.
- std::string: The std::string on Linux behaves strangely. On the one hand, copying is very fast; on the other hand, moving is only 16 times faster than copying. The becomes even more strange if I compile and execute the program without optimization. I get the result on Linux that the move semantics is only 1.5 times faster than the copy semantics. But these numbers are in strong contradiction to the numbers on Windows. On Windows, the move semantic is 15000 times faster than the copy semantic.
The riddle around std::string
The performance difference between the copy and move semantics on Linux and Windows is quickly explained. My GCC implements the std::string according to copy-on-write (cow). This is not conformant to the C++11 standard. But cl.exe implements std::string according to the C++11 standard. I will get different numbers if I compile the program with GCC 6.1 and enable C++11. GCC’s std::string implementation is since 5.1 conformant to the C++11 standard.
Here are the numbers with the online compiler on en.cppreference.com.
Now, there is a big difference between the copy and move semantics.
I hope that was the motivation for the move semantics. In the next post, I will pick two nice characteristics of the move semantic.
Thanks a lot to my Patreon Supporters: Matt Braun, Roman Postanciuc, Tobias Zindl, G Prvulovic, Reinhold Dröge, Abernitzke, Frank Grimm, Sakib, Broeserl, António Pina, Sergey Agafyin, Андрей Бурмистров, Jake, GS, Lawton Shoemake, Jozo Leko, John Breland, Venkat Nandam, Jose Francisco, Douglas Tinkham, Kuchlong Kuchlong, Robert Blanch, Truels Wissneth, Kris Kafka, Mario Luoni, Friedrich Huber, lennonli, Pramod Tikare Muralidhara, Peter Ware, Daniel Hufschläger, Alessandro Pezzato, Bob Perry, Satish Vangipuram, Andi Ireland, Richard Ohnemus, Michael Dunsky, Leo Goodstadt, John Wiederhirn, Yacob Cohen-Arazi, Florian Tischler, Robin Furness, Michael Young, Holger Detering, Bernd Mühlhaus, Matthieu Bolt, Stephen Kelley, Kyle Dean, Tusar Palauri, Dmitry Farberov, Juan Dent, George Liao, Daniel Ceperley, Jon T Hess, Stephen Totten, Wolfgang Fütterer, Matthias Grün, Phillip Diekmann, Ben Atakora, Ann Shatoff, Rob North, Bhavith C Achar, and Marco Parri Empoli.
Thanks, in particular, to Jon Hess, Lakshman, Christian Wittenhorst, Sherhy Pyton, Dendi Suhubdy, Sudhakar Belagurusamy, Richard Sargeant, Rusty Fleming, John Nebel, Mipko, Alicja Kaminska, Slavko Radman, and David Poole.
|My special thanks to Embarcadero|
|My special thanks to PVS-Studio|
|My special thanks to Tipi.build|
|My special thanks to Take Up Code|
I’m happy to give online seminars or face-to-face seminars worldwide. Please call me if you have any questions.
- Embedded Programmierung mit modernem C++ 12.12.2023 – 14.12.2023 (Präsenzschulung, Termingarantie)
Standard Seminars (English/German)
Here is a compilation of my standard seminars. These seminars are only meant to give you a first orientation.
- C++ – The Core Language
- C++ – The Standard Library
- C++ – Compact
- C++11 and C++14
- Concurrency with Modern C++
- Design Pattern and Architectural Pattern with C++
- Embedded Programming with Modern C++
- Generic Programming (Templates) with C++
- Clean Code with Modern C++
- Phone: +49 7472 917441
- Mobil:: +49 176 5506 5086
- Mail: schulung@ModernesCpp.de
- German Seminar Page: www.ModernesCpp.de
- Mentoring Page: www.ModernesCpp.org
Modernes C++ Mentoring,