C++ Core Guidelines: More Traps in the Concurrency

Concurrency provides many ways to shoot yourself in the foot. The rules for today help you to know these dangers and to overcome them.

First, here are the three rules for this post: CP.31, CP.32, and CP.41.

There are more rules in this section of the guidelines which I ignore because they have no content.

CP.31: Pass small amounts of data between threads by value, rather than by reference or pointer

This rule is quite obvious; therefore, I can keep it short. Passing data to a thread by value immediately gives you two benefits:

  1. There is no sharing and, therefore, no data race is possible. The requirements for a data race are mutable, shared state. Read the details here: C++ Core Guidelines: Rules for Concurrency and Parallelism.
  2. You don't have to care about the lifetime of the data. The data stays alive for the lifetime of the created thread. This is particularly important when you detach a thread: C++ Core Guidelines: Taking Care of your Child.

Of course, the crucial question is: what does a small amount of data mean? The C++ Core Guidelines are not clear about this point. In rule F.16 For “in” parameters, pass cheaply-copied types by value and others by reference to const, the guidelines state that 4 * sizeof(int) is a rule of thumb for functions. Meaning, data smaller than 4 * sizeof(int) should be passed by value; bigger data by reference or pointer.

In the end, you have to measure the performance if necessary.
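To make the rule concrete, here is a minimal sketch. The struct SmallData, its members, and the filename are my invention; the point is that the thread works on its own copy, so there is neither mutable, shared state nor a lifetime issue, even though the thread is detached.

// passSmallDataByValue.cpp (hypothetical example)

#include <chrono>
#include <iostream>
#include <thread>

using namespace std::literals::chrono_literals;

struct SmallData{                 // 4 * sizeof(int): cheap to copy
  int id{1};
  int x{2};
  int y{3};
  int z{4};
};

void work(SmallData data){        // gets its own copy; no sharing possible
    std::cout << data.id + data.x + data.y + data.z << std::endl;
}

int main(){

    SmallData data;

    std::thread t(work, data);    // data is copied into the thread
    t.detach();                   // safe: the copy lives inside the thread

    std::this_thread::sleep_for(1s);

}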

CP.32: To share ownership between unrelated threads use shared_ptr

Imagine you have an object which you want to share between unrelated threads. The key question is: who is the owner of the object and, therefore, responsible for releasing the memory? Now you can choose between a memory leak if you don't deallocate the memory and undefined behaviour if you invoke delete more than once. Most of the time, the undefined behaviour ends in a runtime crash.

 

// threadSharesOwnership.cpp

#include <chrono>
#include <iostream>
#include <thread>

using namespace std::literals::chrono_literals;

struct MyInt{
  int val{2017};
  ~MyInt(){                                 // (4)
    std::cout << "Good Bye" << std::endl;   
  }
};

void showNumber(MyInt* myInt){
    std::cout << myInt->val << std::endl;
}

void threadCreator(){
    MyInt* tmpInt= new MyInt;             // (1)
    
    std::thread t1(showNumber, tmpInt);   // (2)
    std::thread t2(showNumber, tmpInt);   // (3)
    
    t1.detach();
    t2.detach();
}

int main(){

    std::cout << std::endl;

    threadCreator();
    std::this_thread::sleep_for(1s);

    std::cout << std::endl;

}

 

Bear with me. The example is intentionally simple. I let the main thread sleep for one second to be sure it outlives the lifetime of the child threads t1 and t2. This is, of course, no appropriate synchronisation, but it helps me to make my point. The vital issue of the program is: who is responsible for the deletion of tmpInt (1)? Thread t1 (2), thread t2 (3), or the function (the main thread) itself? Because I cannot forecast how long each thread runs, I decided to go with a memory leak. Consequently, the destructor of MyInt (4) is never called:

[Output of threadSharesOwnership.cpp: the destructor's "Good Bye" never shows up]

The lifetime issues are quite easy to handle if I use a std::shared_ptr.

[Output of threadSharesOwnershipSharedPtr.cpp]

// threadSharesOwnershipSharedPtr.cpp

#include <chrono>
#include <iostream>
#include <memory>
#include <thread>

using namespace std::literals::chrono_literals;

struct MyInt{
  int val{2017};
  ~MyInt(){
    std::cout << "Good Bye" << std::endl;
  }
};

void showNumber(std::shared_ptr<MyInt> myInt){    // (2)
    std::cout << myInt->val << std::endl;
}

void threadCreator(){
    auto sharedPtr = std::make_shared<MyInt>();    // (1)
    
    std::thread t1(showNumber, sharedPtr);
    std::thread t2(showNumber, sharedPtr);
    
    t1.detach();
    t2.detach();
}

int main(){
    
    std::cout << std::endl;
    
    threadCreator();
    std::this_thread::sleep_for(1s);
    
    std::cout << std::endl;
    
}

Two small changes to the source code were necessary. First, the pointer in (1) became a std::shared_ptr, and second, the function showNumber (2) takes a smart pointer instead of a plain pointer.

CP.41: Minimize thread creation and destruction

How expensive is a thread? Quite expensive! This is the issue behind this rule. Let me first talk about the usual size of a thread and then about the costs of its creation.

Size

A std::thread is a thin wrapper around the native thread. This means I'm interested in the size of a Windows thread and a POSIX thread.

  • Windows systems: the post Thread Stack Size gave me the answer: 1 MB.
  • POSIX systems: the pthread_create man-page provides the answer: 2 MB. These are the sizes for the i386 and x86_64 architectures. If you want to know the sizes for further architectures that support POSIX, here they are:

[Table from the pthread_create man page: default stack sizes for further architectures]
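If you want to check the default on your own POSIX system, you can ask a default-initialised thread-attribute object. The following lines are only a sketch: they are POSIX-only, have to be compiled and linked with -pthread, and glibc reports the system default while other implementations may report 0 until you set a size explicitly.

// stackSize.cpp (POSIX-only sketch)

#include <iostream>
#include <pthread.h>

int main(){

    pthread_attr_t attr;
    pthread_attr_init(&attr);                        // attribute object with default values

    size_t stackSize = 0;
    pthread_attr_getstacksize(&attr, &stackSize);    // default stack size of a new thread
    std::cout << "Default stack size: " << stackSize << " bytes" << std::endl;

    // pthread_attr_setstacksize(&attr, 512 * 1024) would request a smaller stack

    pthread_attr_destroy(&attr);

}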

Creation

I didn't find numbers for how much time it takes to create a thread. To get a gut feeling, I made a simple performance test on Linux and Windows.

I used GCC 6.2.1 on a desktop and cl.exe on a laptop for my performance tests. The cl.exe is part of Microsoft Visual Studio 2017. I compiled the programs with maximum optimisation. This means the flag -O3 on Linux and /Ox on Windows.

Here is my small test program.

// threadCreationPerformance.cpp

#include <chrono>
#include <iostream>
#include <thread>

static const long long numThreads= 1000000;

int main(){

    auto start = std::chrono::system_clock::now();

    for (volatile int i = 0; i < numThreads; ++i) std::thread([]{}).detach();  // (1)

    std::chrono::duration<double> dur= std::chrono::system_clock::now() - start;
    std::cout << "time: " << dur.count() << " seconds" << std::endl;

}

The program creates 1 million threads which execute an empty lambda function (1). These are the numbers for Linux and Windows:

Linux:

[Output on Linux: about 14.5 seconds for one million threads]

This means that the creation of a thread took about 14.5 sec / 1000000 = 14.5 microseconds on Linux.

Windows:

[Output on Windows: about 44 seconds for one million threads]

It took about 44 sec / 1000000 = 44 microseconds on Windows.

To put it the other way around: you can create about 69 thousand threads per second on Linux and about 23 thousand threads per second on Windows.
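The consequence of the rule is obvious: don't pay this price for each small job. Here is a minimal sketch of the idea; the names workerLoop and jobs are my invention, and a real thread pool would be the more mature alternative. One long-lived worker thread consumes many jobs from a queue, so the creation cost is paid exactly once.

// reuseThread.cpp (minimal sketch of a single worker thread)

#include <condition_variable>
#include <functional>
#include <iostream>
#include <mutex>
#include <queue>
#include <thread>

std::mutex mut;
std::condition_variable condVar;
std::queue<std::function<void()>> jobs;
bool done = false;

void workerLoop(){                                  // one long-lived thread ...
    while (true){
        std::function<void()> job;
        {
            std::unique_lock<std::mutex> lock(mut);
            condVar.wait(lock, []{ return done || !jobs.empty(); });
            if (done && jobs.empty()) return;
            job = std::move(jobs.front());
            jobs.pop();
        }
        job();                                      // ... executes many small jobs
    }
}

int main(){

    std::thread worker(workerLoop);                 // created once, not once per job

    for (int i = 0; i < 5; ++i){
        {
            std::lock_guard<std::mutex> lock(mut);
            jobs.push([i]{ std::cout << "job " << i << std::endl; });
        }
        condVar.notify_one();
    }

    {
        std::lock_guard<std::mutex> lock(mut);
        done = true;
    }
    condVar.notify_one();
    worker.join();

}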

What's next?

What is the easiest way to shoot yourself in the foot? Use a condition variable! You don't believe me? Wait for the next post!

 

 

