C++20: The Advantages of Modules


Modules are one of the big four features of C++20: concepts, ranges, coroutines, and modules. Modules promise a lot: faster compilation, isolation of macros, and the abolition of header files and ugly workarounds.


Why do we need modules? I want to step back and describe which steps are involved in getting an executable.

A Simple Executable

Of course, I have to start with "Hello World".

// helloWorld.cpp

#include <iostream>

int main() {
    std::cout << "Hello World" << std::endl;
}

 

Making an executable helloWorld out of the program helloWorld.cpp increases its size by a factor of about 130.

The numbers 100 and 12928 in the screenshot are the sizes in bytes of the source file helloWorld.cpp and of the executable helloWorld.

We should have a basic understanding of what's happening under the hood.

The Classical Build Process

The build process consists of three steps: preprocessing, compilation, and linking.

Preprocessing

The preprocessor handles the preprocessor directives such as #include and #define. It replaces each #include directive with the content of the corresponding header file and expands the macros defined with #define. Thanks to directives such as #if, #else, #elif, #ifdef, #ifndef, and #endif, parts of the source code can be included or excluded.
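
To make the text substitution concrete, here is a minimal sketch. The macros GREETING, SQUARE, and USE_SHORT_GREETING and the helper functions are hypothetical names chosen for this illustration; they are not part of the helloWorld program. The comments show what the compiler sees after preprocessing.

```cpp
#include <cassert>
#include <string>

// Hypothetical macros, chosen only for this illustration.
#define GREETING "Hello World"
#define SQUARE(x) ((x) * (x))
#define USE_SHORT_GREETING 0

std::string greeting() {
#if USE_SHORT_GREETING
    return "Hi";              // excluded: the compiler never sees this line
#else
    return GREETING;          // after preprocessing: return "Hello World";
#endif
}

int squareOfSum(int a, int b) {
    return SQUARE(a + b);     // after preprocessing: return ((a + b) * (a + b));
}

// Small self-check of the substituted code.
void checkSubstitution() {
    assert(greeting() == "Hello World");
    assert(squareOfSum(1, 2) == 9);
}
```

The parentheses in SQUARE matter precisely because the preprocessor knows nothing about C++ expressions: without them, SQUARE(a + b) would expand to a + b * a + b.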

This straightforward text substitution process can be observed by using the compiler flag -E on GCC/Clang, or /E with the Microsoft compiler on Windows.


WOW!!! The output of the preprocessing step has more than half a million bytes. I don't want to blame GCC; the other compilers are similarly verbose, as you can check on Compiler Explorer.

The output of the preprocessor is the input for the compiler.

Compilation

The compilation is performed separately on each output of the preprocessor. The compiler parses the C++ source code and converts it into assembly code. The generated file is called an object file, and it contains the compiled code in binary form. The object file can refer to symbols that are not defined in it. Object files can be put into archives for later reuse. These archives are called static libraries.

The object files that the compiler produces, one per translation unit, are the input for the linker.

Linking

The output of the linker can be an executable or a static or shared library. It's the job of the linker to resolve the references to undefined symbols. Symbols are defined in object files or in libraries. The typical errors at this stage are that symbols aren't defined or are defined more than once.

This build process, consisting of three steps, is inherited from C. It works well enough if you only have one translation unit. But when you have more than one translation unit, many issues can occur.

Issues of the Build Process

Without any attempt to be complete, here are flaws of the classical build process. Modules overcome these issues.

Repeated Substitution of Headers

The preprocessor replaces each #include directive with the content of the corresponding header file. Let me change my initial helloWorld.cpp program to make the repetition visible.

I refactored the program and added two source files, hello.cpp and world.cpp. The source file hello.cpp provides the function hello, and the source file world.cpp provides the function world. Both source files include the corresponding headers. Refactoring means that the program does the same as the previous program helloWorld.cpp; only the internal structure has changed. Here are the new files:

  • hello.cpp and hello.h

 

// hello.cpp

#include "hello.h"

void hello() {
    std::cout << "hello ";
}

// hello.h

#include <iostream>

void hello();

 

  • world.cpp and world.h

 

// world.cpp

#include "world.h"

void world() {
    std::cout << "world";
}

// world.h

#include <iostream>

void world();

 

  • helloWorld2.cpp

 

// helloWorld2.cpp

#include <iostream>

#include "hello.h"
#include "world.h"

int main() {
    
    hello(); 
    world(); 
    std::cout << std::endl;
    
}

 

Building and executing the program works as expected: it prints hello world.

Here is the issue. The preprocessor runs on each source file. This means that the header file <iostream> is substituted into each of the three translation units. Consequently, each preprocessed source file is blown up to more than half a million bytes.

This is a waste of compile time.

In contrast, a module is compiled only once, and importing it is almost for free.

Isolation from Preprocessor Macros

If there is one consensus in the C++ community, it's the following one: we should get rid of the preprocessor macros. Why? Using a macro is just text substitution, which ignores any C++ semantics. Of course, this has many negative consequences. For example, the result may depend on the order in which you include headers that define macros, or macros can clash with already defined macros or names in your application.

Imagine you have two headers, webcolors.h and productinfo.h.

 

// webcolors.h

#define RED 0xFF0000

 

// productinfo.h
#define RED 0

 

When a source file client.cpp includes both headers, the value of the macro RED depends on the order in which the headers are included. This dependency is very error-prone.

In contrast, it makes no difference in which order you import modules.
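
The difference between a macro and a name that follows the C++ scoping rules can be sketched without modules. The following is only an illustration under assumptions: the namespaces webcolors and productinfo and the function isPureRed are hypothetical stand-ins for the two headers above, and the macros are replaced by constants.

```cpp
#include <cassert>

// Hypothetical replacements for the macros in webcolors.h and productinfo.h.
namespace webcolors {
    inline constexpr int RED = 0xFF0000;
}
namespace productinfo {
    inline constexpr int RED = 0;
}

// Both names coexist; the use site states explicitly which one is meant.
constexpr bool isPureRed(int rgb) {
    return rgb == webcolors::RED;
}

void checkColors() {
    static_assert(webcolors::RED != productinfo::RED);
    assert(isPureRed(0xFF0000));
    assert(!isPureRed(productinfo::RED));
}
```

Because each RED lives in its own scope, the order of the two definitions has no influence on their values.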

Multiple Definition of Symbols

ODR stands for the One Definition Rule. In the case of a function, it says:

  • A function can have no more than one definition in any translation unit.
  • A function can have no more than one definition in the program.
  • Inline functions with external linkage can be defined in more than one translation unit. The definitions must all be identical.
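
A minimal single-file sketch of these rules, using the hypothetical functions answer and twice, which are not part of the examples in this post:

```cpp
#include <cassert>

// A declaration may be repeated any number of times ...
int answer();
int answer();

// ... but a non-inline function gets exactly one definition in the program.
int answer() { return 42; }

// An inline function may be defined in several translation units, provided
// that all definitions are identical. This is why it can live in a header.
inline int twice(int x) { return 2 * x; }

void checkDefinitions() {
    assert(answer() == 42);
    assert(twice(21) == 42);
}
```

Adding a second definition of answer to this file would be rejected by the compiler; adding it to another source file of the same program would be rejected by the linker.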

Let's see what my compiler has to say when I try to build a program that breaks the one definition rule. The following code example has two header files, header.h and header2.h. The main program includes the header file header.h twice, once directly and once through header2.h, and therefore breaks the one definition rule, because two definitions of func end up in the same translation unit.

// header.h

void func() {}

// header2.h

#include "header.h"

// main.cpp

#include "header.h"
#include "header2.h"
int main() {}

The compiler complains about the redefinition of func.

We are used to ugly workarounds such as putting an include guard around a header. Adding the include guard FUNC_H to the header file header.h solves the issue.

 

// header.h

#ifndef FUNC_H
#define FUNC_H

void func(){}

#endif
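
To make the effect of the guard visible, here is a minimal single-file sketch that pastes the guarded header twice, the way the preprocessor would for main.cpp. Letting func return int is an assumption made only so that the result can be checked; func in header.h returns nothing. The function checkGuard is hypothetical as well.

```cpp
#include <cassert>

// --- text of header.h, pasted for #include "header.h" ---
#ifndef FUNC_H
#define FUNC_H
int func() { return 42; }   // assumption: returns a value so it can be checked
#endif

// --- text of header.h again, pasted via #include "header2.h" ---
#ifndef FUNC_H              // FUNC_H is already defined, so everything up to
#define FUNC_H              // the matching #endif is skipped
int func() { return 42; }
#endif

// Exactly one definition of func reached the compiler.
void checkGuard() {
    assert(func() == 42);
}
```

The guard only protects a single translation unit. If two source files of the same program include this header, each of them gets its own definition of func, and the linker reports the duplicate.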

 

In contrast, clashes between identical symbols are very unlikely with modules.

Before I end this post, I want to summarize the advantages of modules.

Advantages of Modules

  • Modules are compiled only once, and importing them is almost for free.
  • It makes no difference in which order you import a module.
  • Clashes between identical symbols are very unlikely with modules.
  • Modules enable you to express the logical structure of your code. You can explicitly specify names that should be exported or not. Additionally, you can bundle a few modules into a bigger module and provide them to your customer as a logical package.
  • Thanks to modules, there is no need to separate your source code into an interface and an implementation part.

What's next?

Modules promise a lot. In my next post, I will define and use my first module.

Tags: modules

Comments   

0 #1 Marius 2020-05-11 07:10
There is an error in the summary at the end "Modules are only important once and are literally for free." You meant "imported" not "important".
0 #2 Volkhavvar 2020-05-12 07:01
Like really? Putting #include into header where it's not needed and then wondering why it is included multiple times?
You have told the preprocessor to do exactly that thing.
The second thing is #pragma once directive which is already present in iostream

Third thing is using define to for constants is also outdated better mechanics had been introduced.

After reading your article I got feeling that you are speaking about C language not C++. So if you want to introduce us to the modules you have to choose better examples/bait.
+2 #3 Peter Adam 2020-05-12 07:33
Welcome to Visual Basic
0 #4 Artur 2020-05-16 07:40
>> Identical symbols with modules are very unlikely.

From the experience I know that if something is very unlikely it will happen at the least expected moment. And what one is to do if this happens with modules?

>>Thanks to modules, there is no need to separate your source code into an interface and an implementation part.

I actually consider it(split between interface and implementation) as a nice and neat way of organizing your code. Much better than keeping everything that is full class definition in one file like c# does for example or java.

TBH, poorly explained and the only convincing points to me are:
* Modules are only imported once and are literally for free.
* It makes no difference in which order you import a module.
