The Ranges Library in C++20: Design Choices
Thanks to the ranges library, working with the Standard Template Library (STL) is much more comfortable and powerful. First of all, the algorithms of the ranges library are lazy, can work directly on the container, and can be composed. Additionally, the ranges library made a few unique design choices, you must know.
Before I dive deep into the ranges library in C++20, I want to recap in a few sentences the three main features of the ranges: The algorithms of the ranges can directly operate on the container, evaluate their arguments lazily, and can be composed.
Direct on the Container
The ranges library enables a container such as std::ranges::sort
can directly operate on the container:
// sortRanges.cpp #include <algorithm> #include <iostream> #include <vector> int main() { std::vector<int> myVec{-3, 5, 0, 7, -4}; std::ranges::sort(myVec); // (1) for (auto v: myVec) std::cout << v << " "; // -4, -3, 0, 5, 7 }
On the contrary, the classic std::sort operates on a range defined by two iterators: std:sort(myVec.begin(), myVec.end()
).
The ranges algorithms are lazy and can be composed.
Lazy Evaluation and Function Composition
The following program primesLazy.cpp
applies both features. It creates the first ten prime numbers, starting with one million.
// primesLazy.cpp #include <iostream> #include <ranges> bool isPrime(int i) { for (int j=2; j*j <= i; ++j){ if (i % j == 0) return false; } return true; } int main() { std::cout << '\n'; auto odd = [](int i){ return i % 2 == 1; }; for (int i: std::views::iota(1'000'000) | std::views::filter(odd) | std::views::filter(isPrime) | std::views::take(10)) { std::cout << i << " "; } std::cout << '\n'; }
You have to read the function composition from left to right: I create an infinite data stream starting with 1’000’000 (std::views::iota(1'000'000)
) and apply two filters. Each filter needs a predicate. The first filter let odd elements pass (std::views::filter(odd)
), and the second filter lets the prime numbers pass (std::views::filter(isPrime)
). To end the infinite data stream, I stop after ten numbers (std::views::take(10)
). Finally, here are the first ten prime numbers starting with one million.
You may ask: Who starts the processing of this data pipeline? Now, it goes from right to left. The data sink (std::views::take(10)
) want to have the next value and ask its predecessor. This request continues until the range-based for-loop produces the next value. The range-based for-loop can produce an infinite data stream, but it only produces values on request. This is lazy evaluation.
This was my short recap. When you want to read more about the ranges library, including sentinels, projection, and concepts, read my previous posts:
- The Ranges Library
- Functional Pattern with the Ranges Library
- The Ranges Library in C++20: More Details
- Projections with Ranges
- Sentinels and Concepts with Ranges Algorithms
- Improved Iterators with Ranges
- Pythonic with the Ranges Library
- Pythons range Function, the Second
- Pythons map Function
Now, let me write about something new.
Modernes C++ Mentoring
Do you want to stay informed: Subscribe.
Design Choices
For efficiency reasons, the ranges library has some unique design choices. It’s important to know and follow theses.
When you study the begin
member function of std::ranges::filter_view
, you find code equivalent to the following one:
if constexpr (!ranges::forward_range<V>) return /* iterator */{*this, ranges::find_if(base_, std::ref(*pred_))}; else { if (!begin_.has_value()) begin_ = ranges::find_if(base_, std::ref(*pred_)); // caching return /* iterator */{*this, begin_.value())}; }
Let’s analyze the nested if
clause. First, the compiler checks if begin_.has_value()
is true
. If not, it determines begin_
. This means that this member function caches the result within the std::ranges::filter_view
object for use on subsequent calls. This caching has interesting consequences. Let me exemplify this with a code snippet.
// cachingRanges.cpp #include <numeric> #include <iostream> #include <ranges> #include <vector> int main() { std::vector<int> vec(1'000'000); std::iota(vec.begin(), vec.end(), 0); for (int i: vec | std::views::filter([](auto v) { return v > 1000; }) | std::views::take(5)) { std::cout << i << " "; // 1001 1002 1003 1004 1005 } }
The first call of std::views::filter([](auto v) { return v > 1000; })
determines the begin iterator and reuses it in subsequent calls. The benefit of this caching is obvious. Many subsequent iterations of the pipeline are spared. But there are also severe drawbacks: cache issues and constness issues.
Cache
Here are the two important cache rules for ranges:
- Don’t use a view on modified ranges.
- Don’t copy a view.
Let me play with the previous program cachingRanges.cpp
and break both rules:
// cachingIssuesRanges.cpp #include <concepts> #include <forward_list> #include <iostream> #include <numeric> #include <ranges> #include <vector> void printElements(std::ranges::input_range auto&& rang) { for (int i: rang) { std::cout << i << " "; } std::cout << '\n'; } int main() { std::cout << '\n'; std::vector<int> vec{-3, 10, 4, -7, 9, 0, 5, -5}; // (1) std::forward_list<int> forL{-3, 10, 4, -7, 9, 0, 5, -5}; // (2) auto first5Vector = vec | std::views::filter([](auto v) { return v > 0; }) // (3) | std::views::take(5); auto first5ForList = forL | std::views::filter([](auto v) { return v > 0; }) // (4) | std::views::take(5); printElements(first5Vector); // 10 4 9 5 // (5) printElements(first5ForList); // 10 4 9 5 // (6) std::cout << '\n'; vec.insert(vec.begin(), 10); forL.insert_after(forL.before_begin(), 10); printElements(first5Vector); // -3 10 4 9 5 printElements(first5ForList); // 10 4 9 5 std::cout << '\n'; auto first5VectorCopy{first5Vector}; // (7) auto first5ForListCopy{first5ForList}; // (8) printElements(first5VectorCopy); // -3 10 4 9 5 printElements(first5ForListCopy); // 10 10 4 9 5 std::cout << '\n'; }
To make it easier to follow the problem, I wrote the output directly in the source code. The program does the following steps with a std::vector
and a std::forward_list
. First, both containers are initialized with the initializer list {-3, 10, 4, -7, 9, 0, 5, -5
} (lines 1 and 2). Then, I create two views (lines 3 and 4). Both views first5Vector
and first5ForList
consist of the first 5 elements greater than 0. Lines (5) and (6) display the corresponding values.
Now, I break the first rule: “Don’t use a view on modified ranges.” I insert 10
at the beginning of both containers. Afterward, first5Vector
displays the -3
and first5ForList
ignores the added 10
. After the break of the second rule, “Don’t copy a view.” in lines (7) and (8), the cache of first5ForListCopy
is invalidated. first5VectorCopy
still shows the wrong numbers. Finally, here is the output of the program.
Here is a simple rule of thumb: Use views directly after you have defined them.
You may have noticed that the function printElements
takes it arguments by universal reference, aka forwarding reference.
What’s next?
Let me write in my next post, why a function should take an arbitrary view by universal reference.
Here is a Note on my Behalf
I am sorry to inform you that I have ALS, a very serious progressive nerve condition. Therefore, I am unsure how long I can continue this blog. Currently, I can only travel to trainings or conferences with the help of my wife. We (my family and I) have decided to deal aggressively with this challenge. Therefore, it was important for me to let you, my loyal readers, in on my illness.
Additionally, if you know anything about a clinical trial or some new treatment, please write me an e-mail. In Germany, the only medicament is Riluzol.
Modernes C++
Thanks a lot to my Patreon Supporters: Matt Braun, Roman Postanciuc, Tobias Zindl, G Prvulovic, Reinhold Dröge, Abernitzke, Frank Grimm, Sakib, Broeserl, António Pina, Sergey Agafyin, Андрей Бурмистров, Jake, GS, Lawton Shoemake, Jozo Leko, John Breland, Venkat Nandam, Jose Francisco, Douglas Tinkham, Kuchlong Kuchlong, Robert Blanch, Truels Wissneth, Mario Luoni, Friedrich Huber, lennonli, Pramod Tikare Muralidhara, Peter Ware, Daniel Hufschläger, Alessandro Pezzato, Bob Perry, Satish Vangipuram, Andi Ireland, Richard Ohnemus, Michael Dunsky, Leo Goodstadt, John Wiederhirn, Yacob Cohen-Arazi, Florian Tischler, Robin Furness, Michael Young, Holger Detering, Bernd Mühlhaus, Stephen Kelley, Kyle Dean, Tusar Palauri, Juan Dent, George Liao, Daniel Ceperley, Jon T Hess, Stephen Totten, Wolfgang Fütterer, Matthias Grün, Phillip Diekmann, Ben Atakora, Ann Shatoff, Rob North, Bhavith C Achar, Marco Parri Empoli, Philipp Lenk, Charles-Jianye Chen, Keith Jeffery,and Matt Godbolt.
Thanks, in particular, to Jon Hess, Lakshman, Christian Wittenhorst, Sherhy Pyton, Dendi Suhubdy, Sudhakar Belagurusamy, Richard Sargeant, Rusty Fleming, John Nebel, Mipko, Alicja Kaminska, Slavko Radman, and David Poole.
My special thanks to Embarcadero | |
My special thanks to PVS-Studio | |
My special thanks to Tipi.build | |
My special thanks to Take Up Code | |
My special thanks to SHAVEDYAKS |
Modernes C++ GmbH
Modernes C++ Mentoring (English)
Rainer Grimm
Yalovastraße 20
72108 Rottenburg
Mail: schulung@ModernesCpp.de
Mentoring: www.ModernesCpp.org
Modernes C++ Mentoring,