Semaphores in C++20


Semaphores are a synchronization mechanism used to control concurrent access to a shared resource. They also allow it to play ping-pong.



A counting semaphore is a special semaphore that has a counter that is bigger than zero. The counter is initialized in the constructor. Acquiring the semaphore decreases the counter and releasing the semaphore increases the counter. If a thread tries to acquire the semaphore when the counter is zero, the thread will block until another thread increments the counter by releasing the semaphore.

Edsger W. Dijkstra Invented Semaphores

The Dutch computer scientist Edsger W. Dijkstra presented in 1965 the concept of a semaphore. A semaphore is a data structure with a queue and a counter. The counter is initialized to a value equal to or greater than zero. It supports the two operations wait and signal. wait acquires the semaphore and decreases the counter; it blocks the thread acquiring the semaphore if the counter is zero. signal releases the semaphore and increases the counter. Blocked threads are added to the queue to avoid starvation.

Originally, a semaphore is a railway signal.



The original uploader was AmosWolfe at English Wikipedia. - Transferred from en.wikipedia to Commons., CC BY 2.0

Counting Semaphores in C++20

C++20 supports a std::binary_semaphore, which is an alias for a std::counting_semaphore<1>. In this case, the least maximal value is 1. std::binary_semaphores can be used to implement locks.

using binary_semaphore = std::counting_semaphore<1>;

In contrast to a std::mutex, a std::counting_semaphore is not bound to a thread. This means, that the acquire and release call of a semaphore can happen on different threads. The following table presents the interface of a std::counting_semaphore.



The constructor call std::counting_semaphore<10> sem(5) creates a semaphore sem with a at least maximal value of 10 and a counter of 5. The call sem.max() returns the least maximal value. sem.try_aquire_for(relTime) needs a relative time duration; the member function sem.try_acquire_until(absTime) needs an absolute time point. You can read more about time durations and time points in my previous posts to the time libraray: time.  The three calls sem.try_acquire, sem.try_acquire_for, and sem.try_acquire_until return a boolean indicating the success of the calls.

Semaphores are typically used in sender-receiver workflows. For example, initializing the semaphore sem with 0 will block the receivers sem.acquire() call until the sender calls sem.release(). Consequently, the receiver waits for the notification of the sender. A one-time synchronization of threads can easily be implemented using semaphores.


// threadSynchronizationSemaphore.cpp

#include <iostream>
#include <semaphore>
#include <thread>
#include <vector>

std::vector<int> myVec{};

std::counting_semaphore<1> prepareSignal(0);              // (1)

void prepareWork() {

    myVec.insert(myVec.end(), {0, 1, 0, 3});
    std::cout << "Sender: Data prepared."  << '\n';
    prepareSignal.release();                              // (2)

void completeWork() {

    std::cout << "Waiter: Waiting for data." << '\n';
    prepareSignal.acquire();                              // (3)
    myVec[2] = 2;
    std::cout << "Waiter: Complete the work." << '\n';
    for (auto i: myVec) std::cout << i << " ";
    std::cout << '\n';

int main() {

    std::cout << '\n';

    std::thread t1(prepareWork);
    std::thread t2(completeWork);


    std::cout << '\n';


The std::counting_semaphore prepareSignal (1) can have the values 0 oder 1. In the concrete example, it's initialized with 0 (line 1). This means, that the call prepareSignal.release() sets the value to 1 (line 2) and unblocks the call prepareSignal.acquire() (line 3).


Let me make a small performance test by playing ping-pong with semaphores.

A Ping-Pong Game

In my last post "Performance Comparison of Condition Variables and Atomics in C++20", I implemented a ping-pong game. Here is the idea of the game: One thread executes a ping function and the other thread a pong function. The ping thread waits for the notification of the pong thread and sends the notification back to the pong thread. The game stops after 1,000,000 ball changes. I perform each game five times to get comparable performance numbers. Let's start the game:


// pingPongSemaphore.cpp

#include <iostream>
#include <semaphore>
#include <thread>

std::counting_semaphore<1> signal2Ping(0);       // (1)
std::counting_semaphore<1> signal2Pong(0);       // (2)

std::atomic<int> counter{};
constexpr int countlimit = 1'000'000;

void ping() {
    while(counter <= countlimit) {
        signal2Ping.acquire();                  // (5)

void pong() {
    while(counter < countlimit) {
        signal2Ping.release();                  // (3)

int main() {

    auto start = std::chrono::system_clock::now();

    signal2Ping.release();                     // (4)
    std::thread t1(ping);
    std::thread t2(pong);


    std::chrono::duration<double> dur = std::chrono::system_clock::now() - start;
    std::cout << "Duration: " << dur.count() << " seconds" << '\n';



The program pingPongsemaphore.cpp uses two semaphores: signal2Ping and signal2Pong (1 and 2). Both can have the two values 0 and 1 and are initialized with 0. This means when the value is 0 for the semaphore signal2Ping, a call signal2Ping.release() (3 and 4) set the value to 1 and is, therefore, a notification. A  signal2Ping.acquire() (5) call blocks until the value becomes 1. The same argumentation holds for the second semaphore signal2Pong.


On average, the execution time is 0.33 seconds.

Let me summarize the performance numbers for all ping-pong games. This includes the performance numbers of my last post "Performance Comparison of Condition Variables and Atomics in C++20" and this ping-pong game implemented with semaphores.

All Numbers

Condition variables are the slowest way, and atomic flag the fastest way to synchronize threads. The performance of a std::atomic is in between. There is one downside with std::atomic. std::atomic_flag is the only atomic data type that is always lock-free. Semaphores impressed me most because they are nearly as fast as atomic flags.


My C++20 Book

I quoted in this post excerpts of my current C++20 book. If you are curious about the C++20, I'm happy to have new readers. The book is about 70 % done. Of course, you will get automatically all the updates. Here are the cover and the link to it.

CoverFrameWhat's next?

With latches and barriers, we have more convenient coordination types in C++20. Let me present them in my next post.


Thanks a lot to my Patreon Supporters: Matt Braun, Roman Postanciuc, Tobias Zindl, Marko, G Prvulovic, Reinhold Dröge, Abernitzke, Frank Grimm, Sakib, Broeserl, António Pina, Darshan Mody, Sergey Agafyin, Андрей Бурмистров, Jake, GS, Lawton Shoemake, Animus24, Jozo Leko, John Breland, espkk, Wolfgang Gärtner,  Louis St-Amour, Stephan Roslen, Venkat Nandam, Jose Francisco, Douglas Tinkham, Kuchlong Kuchlong, Avi Kohn, Robert Blanch, Truels Wissneth, Kris Kafka, Mario Luoni, Neil Wang, Friedrich Huber, lennonli, Pramod Tikare Muralidhara, Peter Ware, and Tobi Heideman.


Thanks in particular to Jon Hess, Lakshman, Christian Wittenhorst, Sherhy Pyton, Dendi Suhubdy, Sudhakar Belagurusamy, and Richard Sargeant.

My special thanks to Embarcadero CBUIDER STUDIO FINAL ICONS 1024 Small



I'm happy to give online-seminars or face-to-face seminars world-wide. Please call me if you have any questions.

Bookable (Online)



Standard Seminars 

Here is a compilation of my standard seminars. These seminars are only meant to give you a first orientation.


Contact Me

Modernes C++,





0 #1 ST 2021-01-21 00:14
Semaphore is a general term for visual signalling at a distance and predates railways and the industrial revolution. Some information here:
0 #2 Aiaal 2021-01-21 00:25
The second screenshot with average execution time is wrong.
0 #3 Wassili 2021-01-21 09:38
Screenshot for PingPong game is missing. Interesting results, it will be a great to put all synchronisations machinery of C++ in one table. Bleiben Sie gesund :-)
0 #4 Jemmy 2021-01-21 18:09
What's the implementation in the kernel that makes semaphores so fast, compared with atomic, condition_variable? Any plan to write articles about kernel-level implementation?

My Newest E-Books

Course: Modern C++ Concurrency in Practice

Course: C++ Standard Library including C++14 & C++17

Course: Embedded Programming with Modern C++

Course: Generic Programming (Templates)

Course: C++ Fundamentals for Professionals

Subscribe to the newsletter (+ pdf bundle)

Blog archive

Source Code


Today 5945

Yesterday 8077

Week 22594

Month 207130

All 5759473

Currently are 172 guests and no members online

Kubik-Rubik Joomla! Extensions

Latest comments